
Optimization Question

Started by April 25, 2000 12:01 AM
34 comments, last by Qoy 24 years, 7 months ago
No matter how much I hate to ask... I have a problem with optimization. I have written my own bitmap drawing functions, and when I draw about 4 or 5 enemies on the screen it takes the framerate down a LOT. I'm talking from 86 FPS to about 40... I was wondering if anybody thinks there's anything I can do to optimize this function:

// draws a bitmap with no clipping
void DrawBitmap(Bitmap* bitmap, const int x, const int y, RECT* srcRect)
{
	short* picBuffer = bitmap->bits;  // the buffer for the bitmap data
	int srcWidth = srcRect->right - srcRect->left,  // width of the src rectangle
		srcHeight = srcRect->bottom - srcRect->top;  // height of the src rectangle

	// point videoBuffer to the dest coords on the surface
	videoBuffer = ((short*)desc.lpSurface) + (x + (y * pixelLPitch));
	// assign the starting position in the bitmap
	picBuffer += srcRect->left + (srcRect->top * bitmap->width); 
	// now the buffers point to the correct memory

	// for every line
	for(int height = 0; height < srcHeight; height++)
	{
		// for every pixel
		for(int width = 0; width < srcWidth; width++)
			// if the pixel isn't transparent, then plot it
			if(picBuffer[width] != transparentColor)
				videoBuffer[width] = picBuffer[width];
		// advance the pointers to the next line
		videoBuffer += pixelLPitch;
		picBuffer += bitmap->width;
	}
} // end DrawBitmap
 
I have a clipping version too, but it's the same with some clipping stuff beforehand... Thanks!

------------------------------
Jonathan Little
invader@hushmail.com
http://www.crosswinds.net/~uselessknowledge
Do-while loops are a little bit faster than for loops. And if any of your enemies are not transparent, then you should make another function for that case.

*** Triality ***
Everything I'm drawing is transparent, so I don't need a non-transparent function. And, instead of code structure stuff (while vs. for loops) I was thinking of something in the "algorithm" itself. I doubt anything is there to be optimized, but I'm just thinking it's weird, because when I turn on compiler optimizations, it goes back up to 86 FPS, so it seems the compiler can find something...
That algorithm is O(n^2), because you have a loop nested within another loop. Very slow.

Usually you'd do something like this:

DWORD ScreenLine = Top * ScreenPitch;
DWORD PicLine = 0;
DWORD PicDelta = (Right - Left) * (ScreenBits >> 3);
for( Y = Top; Y < Bottom; Y ++ )
{
    memcpy( &VideoBuffer[ScreenLine + Left], &PicBuffer[PicLine], PicDelta );
    ScreenLine += ScreenPitch;
    PicLine += PicDelta;
}

That's as fast as DirectDraw's BltFast, but it doesn't allow for transparency. You'd have to write an assembly blitter and optimize it (that's a good idea even for opaque sprites). I'm not sure how the transparency code would be done, though.
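
For what it's worth, here is a rough, self-contained sketch of the same row-by-row opaque copy with the units spelled out. It assumes both pitches and the row length are given in bytes; BlitOpaque and its parameter names are made up for illustration and are not part of DirectDraw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a DirectDraw call): opaque copy of `rows` rows.
// Assumption: dstPitchBytes, srcPitchBytes and rowBytes are all byte counts.
void BlitOpaque(std::uint8_t* dst, std::size_t dstPitchBytes,
                const std::uint8_t* src, std::size_t srcPitchBytes,
                std::size_t rowBytes, int rows)
{
    // A row must fit inside one line of both buffers.
    assert(rowBytes <= dstPitchBytes && rowBytes <= srcPitchBytes);

    for (int y = 0; y < rows; ++y)
    {
        std::memcpy(dst, src, rowBytes);  // copy one row of pixels
        dst += dstPitchBytes;             // step to the next destination line
        src += srcPitchBytes;             // step to the next source line
    }
}
```

Keeping everything in bytes avoids mixing pixel indices with byte counts, which is easy to get wrong when the surface pitch is wider than the visible row.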

~CGameProgrammer( );

CGameProgrammer, it's not n^2 unless you define n to be the average width of an image. It's actually linear in the total size of the bitmap (width*height, which is where the double loop comes from).
Speeding it up - the only thing I could think of is to RLE the transparency data so you can skip large parts.
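
A minimal sketch of what I mean by RLE, assuming 16-bit pixels and a made-up run format of [skip count, copy count, pixels...] per row. EncodeRow and DrawRow are hypothetical names, and a real version would encode the whole sprite once at load time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical format: a row becomes a list of runs, each stored as
// [skip, copy, pixel * copy], where `skip` counts transparent pixels.
std::vector<std::uint16_t> EncodeRow(const std::uint16_t* src, int width,
                                     std::uint16_t key)
{
    assert(width >= 0 && width <= 0xFFFF);
    std::vector<std::uint16_t> out;
    int i = 0;
    while (i < width)
    {
        int skip = 0;
        while (i < width && src[i] == key) { ++skip; ++i; }  // transparent run
        const int start = i;
        while (i < width && src[i] != key) ++i;              // opaque run
        out.push_back(static_cast<std::uint16_t>(skip));
        out.push_back(static_cast<std::uint16_t>(i - start));
        out.insert(out.end(), src + start, src + i);
    }
    return out;
}

// Draws one encoded row: no per-pixel test, just skip and memcpy per run.
void DrawRow(std::uint16_t* dst, const std::vector<std::uint16_t>& runs)
{
    std::size_t pos = 0;
    while (pos + 1 < runs.size())
    {
        dst += runs[pos];                       // jump over transparent pixels
        const std::size_t copy = runs[pos + 1];
        if (copy > 0)
            std::memcpy(dst, runs.data() + pos + 2, copy * sizeof(std::uint16_t));
        dst += copy;
        pos += 2 + copy;
    }
}
```

The draw loop then costs one iteration per run instead of one comparison per pixel, so sprites with big transparent areas benefit most.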


#pragma DWIM // Do What I Mean!
~ Mad Keith ~
**I use Software Mode**
It's only funny 'till someone gets hurt. And then it's just hilarious. Unless it's you.
Is picBuffer coming from video memory? You could be stalling the video card on the reads, or maybe shootin' too much junk over the bus (from video RAM back to the CPU, when you do "picBuffer[width]").
Honestly, I doubt it has ANYTHING to do with memory. It's the fact that your innermost loop contains an 'if'. Remember, ifs mean branch instructions, which can mean stalling and flushing the pipeline whenever the branch is mispredicted. Putting one in a loop to be executed that many times is a recipe for slowdown. And if the compiler can unroll the loop and schedule things a little better, it would explain the change in a release build.

As for fixing it... You may want to see if there's some way to avoid the if by using masking operations that will achieve the same end result. It all depends on how your colors are stored.
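
Here is a rough sketch of the masking idea, assuming 16-bit pixels and a single color key like the transparentColor in the original function. BlitRowMasked is a made-up name, and whether the comparison compiles to branch-free code depends on the compiler, so treat it as something to measure rather than a guaranteed win.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: copies one row of 16-bit pixels and leaves the
// destination untouched wherever the source equals `key`, without an
// if statement in the loop body.
void BlitRowMasked(std::uint16_t* dst, const std::uint16_t* src,
                   int count, std::uint16_t key)
{
    assert(count >= 0);
    for (int i = 0; i < count; ++i)
    {
        // mask is 0xFFFF for an opaque source pixel, 0x0000 for the key color
        const std::uint16_t mask =
            static_cast<std::uint16_t>(-static_cast<int>(src[i] != key));
        // select the source pixel where mask is set, else keep the destination
        dst[i] = static_cast<std::uint16_t>((src[i] & mask) | (dst[i] & ~mask));
    }
}
```

One caveat: this reads and writes every destination pixel, so if the destination is a locked video-memory surface the extra reads may cost more than the branch saved.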

-Brian
Use Blt, there's no better way.
Also, surfaces in video memory with Blt are very fast, since it's all accelerated and the CPU does virtually squat.


The_Minister (mwronen@mweb.co.za)
1C3-D3M0N Interactive
quote: Original post by The_Minister

Use Blt, there's no better way.


If by the best way, you mean the fastest, you are wrong. Compiled sprites still whoop the crap out of blits. Basically, what you do is write a program that takes your sprites and turns them into code. Then you just call that function to display them. Technically, it runs in O(1) time. RLE has been said to run faster than blits also, and doesn't take precomputation. But then we are into linear with respect to non-transparent pixels.
However, if by best you mean easiest and laziest way, then you are correct.

Mike
"Unintentional death of one civilian by the US is a tragedy; intentional slaughter of a million by Saddam - a statistic." - Unknown
I somehow doubt that you could get straight CPU-only asm code faster than a hardware supported Alpha blit myself...

Did some fast coding once upon a time, and found it sucked. Using other people's stuff is much more efficient :-).

So basically: if you have HW support for your transparent blits (or alpha blits) use that; if you don't, use a highly optimised ASM version for the software side of things.


#pragma DWIM // Do What I Mean!
~ Mad Keith ~
**I use Software Mode**
It's only funny 'till someone gets hurt. And then it's just hilarious. Unless it's you.

This topic is closed to new replies.
