With that in mind, let's take a look at this code:
void BltLinear(LINEAR_BITMAP far * BM, int x, int y, UINT8 far * ScreenBase)
{
    int Top; /* coordinate values of bitmap top-left corner */
    int Left;
    int BltWidth; /* width of bitmap so we don't dereference pointers */
    int BltHeight; /* height of bitmap so we don't dereference pointers */
    UINT16 TempOffset; /* temp variable to calc far pointer offsets */
    UINT8 far * Screen; /* pointer to current screen position */
    UINT8 far * Bitmap; /* pointer to current bitmap position */
    unsigned WidthCounter;
    unsigned HeightCounter;
    unsigned ScreenIncrement;
    assert(BM != NULL);
    assert(ScreenBase != NULL);
    //Compute our left and top starting points
    Left = x - BM->OriginX;
    Top = y - BM->OriginY;
    //Compute our Screen location
    TempOffset = Top * ScreenWidth + Left;
    Screen = ScreenBase + TempOffset;
    //Compute our bitmap pointer
    Bitmap = &(BM->Data);
    //Alias the bitmap dimensions
    BltWidth = BM->Width;
    BltHeight = BM->Height;
    //How much should we increment the screen inside of the loop
    ScreenIncrement = ScreenWidth - BltWidth;
    for (HeightCounter = 0; HeightCounter < BltHeight; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < BltWidth; WidthCounter++)
        {
            //Pixel value 0 is transparent, so only copy non-zero pixels
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;
    }
}
[size="3"]Always flip and unroll loops
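Flipping a loop (counting down to zero instead of up to a limit) and unrolling it (doing several iterations' worth of work per pass) are "classic" optimizations that we can apply to our code. This may seem obvious, but it is worth mentioning.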
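As a rough illustration, here is a sketch of a row copy where the loop is flipped (taken here to mean that the counter runs down to zero) and unrolled four times. It uses ordinary pointers instead of far pointers so it stays portable, and the name CopyRowUnrolled and the unroll factor of four are my own choices for the example, not part of the blitter above.

```c
/* Hypothetical helper: copy Count pixels from Src to Dst.
   The loops count down to zero and the main loop handles
   four pixels per pass. */
void CopyRowUnrolled(unsigned char *Dst, const unsigned char *Src, unsigned Count)
{
    unsigned Blocks = Count / 4;   /* number of full 4-pixel groups */
    unsigned Rest   = Count % 4;   /* leftover pixels at the end */

    while (Blocks != 0)            /* test against zero, not a limit */
    {
        Dst[0] = Src[0];
        Dst[1] = Src[1];
        Dst[2] = Src[2];
        Dst[3] = Src[3];
        Dst += 4;
        Src += 4;
        Blocks--;
    }
    while (Rest != 0)              /* copy the remaining 0 to 3 pixels */
    {
        *Dst++ = *Src++;
        Rest--;
    }
}
```

Whether this actually beats the plain loop depends on the compiler and the machine, so time both versions before keeping one.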
[size="3"]Don't use far pointers to bitmaps unless you need to
Unless you are going to be blitting very large (64K or over) bitmaps, don't use far pointers to bitmaps. Far pointer operations are slower than near pointer operations. A far pointer is composed of a segment and an offset, so a segment register has to be loaded every time it is used, while a near pointer is only an offset.
[size="3"]Use register variables for the loop counters whenever possible
We can turn the loop counters into register variables. Those are probably the best ones to use registers for because they are frequently accessed. To do this, we just need to add the register keyword to the declarations of the counter variables.
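A minimal sketch of what that looks like, using a rectangle fill instead of the full blitter to keep it short. FillBlock and SCREEN_WIDTH are names I made up for the example, and the pointer is an ordinary one, not a far pointer. Keep in mind that register is only a hint, and the compiler is free to ignore it.

```c
#define SCREEN_WIDTH 320   /* mode 13h width, assumed for this sketch */

/* Hypothetical helper: fill a Width x Height block with one color. */
void FillBlock(unsigned char *Screen, unsigned Width, unsigned Height,
               unsigned char Color)
{
    register unsigned WidthCounter;    /* hint: keep in a CPU register */
    register unsigned HeightCounter;   /* hint: keep in a CPU register */

    for (HeightCounter = 0; HeightCounter < Height; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < Width; WidthCounter++)
        {
            *Screen++ = Color;
        }
        Screen += SCREEN_WIDTH - Width;   /* move to the start of the next row */
    }
}
```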
[size="3"]Don't use asserts!
Asserting something might be good debugging practice, but it does not help the execution time of our blitter. You can either take out the asserts completely, or you can define NDEBUG (before assert.h is included) when you compile the final version. If you don't know already, an assert macro expands to an if statement. That means each assert costs us a conditional jump, and conditional jumps are slow.
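Here is a small sketch of the NDEBUG approach. RELEASE_BUILD is a hypothetical macro that you would define yourself for the final build, and FirstPixel is a made-up helper. The only part that comes from the C library is the rule that assert does nothing when NDEBUG is defined at the point where assert.h is included.

```c
/* RELEASE_BUILD is a hypothetical macro; define it on the compiler
   command line for the final version only. */
#ifdef RELEASE_BUILD
#define NDEBUG               /* must come before assert.h is included */
#endif
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the first pixel of a bitmap.
   The pointer check is compiled only when NDEBUG is not defined. */
unsigned char FirstPixel(const unsigned char *Bitmap)
{
    assert(Bitmap != NULL);  /* generates no code when NDEBUG is defined */
    return Bitmap[0];
}
```

This way the checks stay in the source for debugging and cost nothing in the release version.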
[size="3"]Use global variables instead of passing parameters
You can use global variables to speed up this function. Instead of passing parameters, you just set the values of the global variables before the call, so nothing has to be pushed onto the stack. This may seem odd, but it works. Almost everyone has heard or read somewhere that global variables are a no-no, but some rules have to be broken when you are a game programmer.
[size="3"]Don't use variables where you don't need them!
The code references the variable ScreenWidth. While this may seem to make sense, it is better to hard-code the value (in this case 320). Since this function will only work in mode 13h, how will supporting different resolutions help? Also, if you don't use that variable, you don't have to set it, which makes reusing the code in another game easier.
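A sketch of the hard-coded version of the offset calculation follows. ScreenOffset and SCREEN_WIDTH are names invented for the example. A #define is still a compile-time constant, so the compiler sees the literal 320 and may be able to replace the multiply with shifts and adds.

```c
#define SCREEN_WIDTH 320   /* mode 13h is always 320 pixels wide */

/* Hypothetical helper: offset of pixel (x, y) from the start of the screen.
   The width is a constant, so there is no variable to load or to set up. */
unsigned ScreenOffset(unsigned x, unsigned y)
{
    return y * SCREEN_WIDTH + x;
}
```

The largest offset in mode 13h is for the pixel at (319, 199), which works out to 199 * 320 + 319 = 63999.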
[size="3"]Take out as many features as possible!
This is important, albeit not quite as much in this particular function. You don't always want to use a blitter packed with features you don't need. For instance, say you have a blitter which can do clipping, rotating, scaling, skewing, and dithering, but you just need a simple blitter. You should use the blitter that has the least overhead. The fewer features, the faster. In this example, we can take out the Left and Top variables because all they do is subtract OriginX and OriginY from x and y. While that may be helpful in some cases, it is hardly necessary.
[size="3"]Use the smallest size variables possible
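Take a look at the declaration for TempOffset. The type UINT16 means an unsigned 16-bit integer, not an unsigned long. That is all you need to hold the value, because the largest offset in mode 13h is just under 64000 (320 x 200), and an unsigned 16-bit integer can hold that plus fifteen hundred odd more (up to 65535). Note that a signed 16-bit int tops out at 32767 and would be too small, while a long would only cost extra instructions on a 16-bit compiler.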
[size="3"]Don't pass structs as parameters!
In the code you pass a far pointer to a structure to the function. Bad idea! The pointer has to be pushed onto the stack, and every member has to be fetched through it. What you should do is make the width and height global variables, and pass the bitmap as an unsigned char pointer. This makes it faster still, because you can take out the assignments that alias the structure members! Passing a whole structure by value is an even worse idea in any time-critical function, because every member has to be copied onto the stack.
[size="3"]Don't pass the screen as a parameter!
You want to have the screen as a global variable. Since the address of the screen does not change, you don't need to make a new far pointer each time you blit. This helps in two ways, since you also have one less parameter to push and pop.
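The last few tips (globals instead of parameters, no structure parameter, a global screen pointer) can be combined into something like the sketch below. All of the names are hypothetical. In a real DOS program ScreenPtr would be a far pointer to the video segment; here it is an ordinary pointer so that the sketch stays portable.

```c
#define SCREEN_WIDTH 320      /* mode 13h width, hard-coded */

/* Hypothetical globals that replace the structure and screen parameters. */
unsigned char *ScreenPtr;     /* set once at start-up */
unsigned BltWidth;            /* set before each blit */
unsigned BltHeight;           /* set before each blit */

/* Transparent blit of BltWidth x BltHeight pixels to (x, y).
   Pixel value 0 is treated as transparent, as in the original code. */
void BltGlobal(const unsigned char *Bitmap, unsigned x, unsigned y)
{
    unsigned char *Screen = ScreenPtr + y * SCREEN_WIDTH + x;
    unsigned ScreenIncrement = SCREEN_WIDTH - BltWidth;
    unsigned w, h;

    for (h = BltHeight; h != 0; h--)
    {
        for (w = BltWidth; w != 0; w--)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;   /* skip to the next screen row */
    }
}
```

To use it, point ScreenPtr at the screen once, then set BltWidth and BltHeight and call BltGlobal with the raw pixel data for each sprite.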
[size="3"]Always write a transparent version and a non-transparent version
You want to write two versions because if you know a bitmap has no transparent pixels, you can use a non-transparent blitter. Non-transparent blitters are much faster because the conditional jump for the transparency test is never executed. This also relates to not putting in every feature.
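A sketch of the two versions side by side, again with ordinary pointers and made-up names (BltTransparent, BltOpaque, SCREEN_WIDTH). Once the per-pixel test is gone, a whole row can be handed to memcpy, which is one way to write the opaque case but by no means the only one.

```c
#include <string.h>

#define SCREEN_WIDTH 320   /* mode 13h width, assumed for this sketch */

/* Transparent version: pixel value 0 is skipped, one test per pixel. */
void BltTransparent(unsigned char *Screen, const unsigned char *Bitmap,
                    unsigned Width, unsigned Height)
{
    unsigned w;

    while (Height != 0)
    {
        for (w = 0; w < Width; w++)
        {
            if (Bitmap[w] != 0)
            {
                Screen[w] = Bitmap[w];
            }
        }
        Screen += SCREEN_WIDTH;
        Bitmap += Width;
        Height--;
    }
}

/* Opaque version: no per-pixel test, so each row is a straight copy. */
void BltOpaque(unsigned char *Screen, const unsigned char *Bitmap,
               unsigned Width, unsigned Height)
{
    while (Height != 0)
    {
        memcpy(Screen, Bitmap, Width);
        Screen += SCREEN_WIDTH;
        Bitmap += Width;
        Height--;
    }
}
```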
[size="3"]Use compiled sprites
A compiled sprite is a function that is created in inline assembly by a program called a sprite compiler. A compiled sprite function is simple to use. You just call the function and the sprite draws itself. Compiled sprites take out the tests for transparency and all conditional and unconditional jumps, and are generally assembly code. Use them whenever possible. Soon I will have a section on that, but it's under construction.
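To give a feel for the idea, here is roughly what the output of a sprite compiler could look like if it emitted C instead of assembly. This is only an illustration under that assumption: the function name, the tiny 3 x 2 sprite, and SCREEN_WIDTH are all invented for the example, and a real sprite compiler would normally generate assembly.

```c
#define SCREEN_WIDTH 320   /* mode 13h width, assumed for this sketch */

/* Hypothetical generated code for this 3 x 2 sprite (0 = transparent):
       5 0 6
       0 7 0
   There is one store per visible pixel, with no loops and no
   transparency tests. Screen points at the sprite's top-left pixel. */
void DrawSprite_Example(unsigned char *Screen)
{
    Screen[0] = 5;
    Screen[2] = 6;
    Screen[SCREEN_WIDTH + 1] = 7;
}
```

The price is code size, since every sprite becomes its own function.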
[size="3"]Don't use pixel plot functions
Don't use pixel plot functions inside blit functions. The only time that is acceptable is when the pixel plot is a macro and not actually a function, so there is no call overhead. Our example does not have this shortcoming, but many blitters do.
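A sketch of the macro form, with hypothetical names (PLOT_PIXEL, HLine, SCREEN_WIDTH) and an ordinary pointer standing in for the far screen pointer. The macro expands in place, so no function call is made for each pixel.

```c
#define SCREEN_WIDTH 320   /* mode 13h width, assumed for this sketch */

/* Hypothetical pixel plot macro: expands inline at each use. */
#define PLOT_PIXEL(Screen, x, y, Color) \
    ((Screen)[(unsigned)(y) * SCREEN_WIDTH + (unsigned)(x)] = (unsigned char)(Color))

/* Hypothetical helper: horizontal line from x0 to x1 inclusive. */
void HLine(unsigned char *Screen, unsigned x0, unsigned x1, unsigned y,
           unsigned char Color)
{
    unsigned x;

    for (x = x0; x <= x1; x++)
    {
        PLOT_PIXEL(Screen, x, y, Color);
    }
}
```

Even as a macro, the offset is recomputed for every pixel, which is why the blitter in this article walks a pointer across the screen instead.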
[size="3"]Never use the BGI in any shape or form
The same goes for the Microsoft graphics libraries. They are just too slow! You might use them to demonstrate a concept or for an example on game AI, but never in a game. Especially not in our blitter. This does not apply to our example blitter, but it is fairly important. I know of no successful game that has been written using either library.
[size="3"]A final note
I would like to thank you for taking the time to read this article. As final advice, never stop looking for faster ways to do things. I am sure that these are not all of them. Please email me with your tips, tricks, and advice. That's all.
Copyright 1996, David Berube