
I'll show you mine, if you'll show me yours

Started by July 03, 2000 09:11 AM
13 comments, last by DeltaVee 24 years, 5 months ago
I am writing a 360-degree top-down shoot-em-up (why not?). So basically I have written a routine that rotates a map segment (which is pre-rendered on a surface) and blts it onto another surface. Now the only problem is it is a bit slow. Anybody have any faster methods? Or suggestions? On a 433 Celeron the blt takes 87 milliseconds to run. Destination blt size = 480 x 480, 16 bpp; source surface is approx 700 x 700. Code compiled with Visual Studio 6, compiler optimized for speed. Here is the code:
    
bool MyRotateBlt(CRect cDest,LPDIRECTDRAWSURFACE7 lpDest,CRect cSrce,LPDIRECTDRAWSURFACE7 lpSrce,double dAng)
{
	// lock the destination surface and calculate metrics

	DDSURFACEDESC2 DestDesc;

	ZeroMemory(&DestDesc, sizeof(DestDesc));
	DestDesc.dwSize = sizeof(DestDesc);

	HRESULT result = lpDest->Lock( NULL, &DestDesc, DDLOCK_WRITEONLY, NULL );
	if( result != DD_OK )
		return false;

	int nBytesPerPlane = DestDesc.ddpfPixelFormat.dwRGBBitCount / 8;

	LONG lDestPels = DestDesc.lPitch / nBytesPerPlane;
	LONG lDestPelsLeft = lDestPels - cDest.Width();
	LPWORD lpDestMem = (LPWORD)DestDesc.lpSurface;

	// lock the source surface and calculate metrics

	DDSURFACEDESC2 SrceDesc;

	ZeroMemory(&SrceDesc, sizeof(SrceDesc));
	SrceDesc.dwSize = sizeof(SrceDesc);

	result = lpSrce->Lock( NULL, &SrceDesc, DDLOCK_READONLY, NULL );
	if( result != DD_OK )
	{
		// release the destination lock taken above before bailing out
		lpDest->Unlock(NULL);
		return false;
	}

	LONG lSrcePels = SrceDesc.lPitch / nBytesPerPlane;
	LPWORD lpSrceMem = (LPWORD)SrceDesc.lpSurface;

	// pre-calculate various metrics

	long double sinAng = sin(DEGTORAD(dAng));
	long double cosAng = cos(DEGTORAD(dAng));

	int nSrceCX = cSrce.left + cSrce.Width() / 2;
	int nSrceCY = cSrce.top + cSrce.Height() / 2;

	int nCenterMemLoc = nSrceCX + nSrceCY * lSrcePels;

	long double nDestLeft = - cDest.Width() / 2;
	long double nDestRight = nDestLeft + cDest.Width();
	long double nDestTop = - cDest.Height() / 2;
	long double nDestBottom = nDestTop + cDest.Height();

	int nXDestOff = cDest.left;
	int nYDestOff = cDest.top * lDestPels + nXDestOff;

	long double sinY;
	long double cosY;
	for (long double nY = nDestTop ; nY < nDestBottom; nY ++ , nYDestOff += lDestPelsLeft)
	{
		sinY = nY * sinAng;
		cosY = nY * cosAng;
		for (long double nX = nDestLeft ; nX < nDestRight ; nX ++)
		{
			// perform transformation here

			int nSX = (int)(nX * cosAng + sinY);
			int nSY = (int)(nX * sinAng - cosY);

			lpDestMem[nYDestOff ++] = lpSrceMem[nCenterMemLoc + nSY * lSrcePels + nSX];
		}
	}

	lpSrce->Unlock(NULL);
	lpDest->Unlock(NULL);

	return true;
}

    
1. The error trapping sux, please no comments on that.
2. I am using long doubles because that is the size of the reals on the FPU (80 bits), therefore there are no conversions needed to load them into the FPU registers. (Mixing real and integer arithmetic allows the compiler to create code that runs on the FPU and CPU at the same time, effectively parallel processing.)
3. In 8 bpp there is absolutely no increase in performance, as it takes just as long to move a byte as it does to move a word.
4. In 24 bpp there is a 30 percent performance hit, as a byte and a word need to be moved (no assembler to move 24 bits at a time).
5. Both surfaces are in system memory.
6. There are no comments because my code is beautiful and is self-documenting.
7. If you are a newbie, you may wet yourself in excitement. Look, free code that does stuff.

--------------------------
Carpe Diem
Check out the rotation code for my game (http://arkia.tripod.com). I have a rotation function in it that is unbelievably optimized. The y loop isn't too good, but the x loop is perfect. Since I used fixed point I converted the multiplies in:
int nSX = (int)(nX * cosAng + sinY);
int nSY = (int)(nX * sinAng - cosY);
lpDestMem[nYDestOff ++] = lpSrceMem[nCenterMemLoc + nSY * lSrcePels + nSX];

to basically 2-4 adds and 2 if's. That's a good reason to use fixed point, huh? Look in ddbitmap.cpp in the PlgCore16 for the source code to the x loop.
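
For readers who don't want to dig through the linked source, here is a minimal sketch of the general idea of replacing per-pixel multiplies with adds. It is not the PlgCore16 code (which is not shown in this thread); it is a plain fixed-point accumulator, and the names `fixed16`, `SrcCoord`, `toFixed` and `walkRow` are made up for illustration. It assumes 16 fractional bits and that `>>` on a negative value is an arithmetic shift, which holds on mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 16.16 fixed-point type: the low 16 bits hold the fraction.
using fixed16 = std::int32_t;
constexpr int kShift = 16;

// Convert a floating-point value to 16.16 fixed point (truncating).
inline fixed16 toFixed(double v)
{
    return static_cast<fixed16>(v * (1 << kShift));
}

struct SrcCoord { int x; int y; };

// Walk one destination row. (startX, startY) is the source position of the
// first pixel in the row; (stepX, stepY) is how far the source position moves
// per destination pixel, i.e. (cos, sin) of the angle for a pure rotation.
// The inner loop uses two adds and two shifts, no multiplies.
std::vector<SrcCoord> walkRow(fixed16 startX, fixed16 startY,
                              fixed16 stepX, fixed16 stepY, int width)
{
    assert(width >= 0);
    std::vector<SrcCoord> out;
    out.reserve(width);
    fixed16 sx = startX;
    fixed16 sy = startY;
    for (int i = 0; i < width; ++i)
    {
        out.push_back({ sx >> kShift, sy >> kShift });
        sx += stepX;   // replaces nX * cosAng
        sy += stepY;   // replaces nX * sinAng
    }
    return out;
}
```

The trade-off is that rounding error in the step accumulates along the row, so the number of fractional bits has to be chosen with the row width in mind.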
For a good time hit Alt-F4! Go ahead try it, all the cool people are doing it.
Skip some lines; that might improve readability, which might make it easier to find some errors.

-----------------------------

A wise man once said "A person with half a clue is more dangerous than a person with or without one."
Now I'm far from the best coder out there, so I don't know if there is a better algorithm for what you're doing (I'd take blue-lightning up on his offer). But I see a few things right off the bat.

1) Instead of dividing by 2 or 8, why not use bit shifts?

2) In the last for loop, you are doing the following:
int nSX = (int)(nX * cosAng + sinY);
int nSY = (int)(nX * sinAng - cosY);

the "cosAng + sinY" and "sinAng - cosY" don't change during the lifetime of the loop, so you could save them in temp vars before entering and stand to save an addition and subtraction per cycle.

3) From what I've heard, sin() and cos() are quite slow. People seem quite fond of throwing the results in a lookup table. Maybe that would help.

Ok, so overall this isn't speeding much up, but I tried, right?
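
The lookup-table suggestion in point 3 could be sketched roughly as follows. This is only an illustration under assumptions: whole-degree resolution, 8 fractional bits of fixed point, and the names `TrigTable`, `buildTrigTable` and `wrapDegrees` are hypothetical, not taken from anyone's code in this thread.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point type with 8 fractional bits (1.0 == 256).
using fixed8 = std::int32_t;
constexpr int kFracBits = 8;

struct TrigTable
{
    std::array<fixed8, 360> sinFx;
    std::array<fixed8, 360> cosFx;
};

// Build the tables once at start-up; afterwards a lookup replaces each
// sin()/cos() call.
TrigTable buildTrigTable()
{
    const double kPi = 3.14159265358979323846;
    TrigTable t{};
    for (int deg = 0; deg < 360; ++deg)
    {
        const double rad = deg * kPi / 180.0;
        t.sinFx[deg] = static_cast<fixed8>(std::lround(std::sin(rad) * (1 << kFracBits)));
        t.cosFx[deg] = static_cast<fixed8>(std::lround(std::cos(rad) * (1 << kFracBits)));
    }
    return t;
}

// Map any angle in degrees, including negative ones, into 0..359.
int wrapDegrees(int deg)
{
    assert(deg > -1000000 && deg < 1000000); // sanity bound for the sketch
    const int r = deg % 360;
    return r < 0 ? r + 360 : r;
}
```

Note that in the posted routine sin() and cos() are only called once per blt, so the table mostly pays off when it goes together with a switch to fixed point.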

-> Briar LoDeran <-
I have some sprite-rotating code on my site that seems to be pretty fast... it could use a little more optimizing, but I think it's OK.

lntakitopi@aol.com - http://www.geocities.com/guanajam
-- Use cos/sin table lookups instead of calling the functions themselves...

-- instead of dividing by 2, shift your value to the right by 1... and when dividing by 8, shift your value to the right by 3

-- don't use a nested FOR loop if possible...

-- try to convert the loop to assembler code

-- set VC's Build option to Release and not Debug mode...

that should speed your code up a bit.. might even use fixed point math, replacing the double/float's... that's all i can say by just glancing at your code...
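
One caveat on the shift-instead-of-divide tip, sketched below with hypothetical helper names: for signed values the two are not interchangeable. Division rounds toward zero, while an arithmetic right shift rounds toward negative infinity (and before C++20 the result of shifting a negative value is implementation-defined). That matters here because the routine computes things like `- cDest.Width() / 2`.

```cpp
#include <cassert>

// Integer division by two: rounds toward zero.
int halfByDivide(int v)
{
    return v / 2;
}

// Right shift by one: rounds toward negative infinity on compilers that use
// an arithmetic shift for signed values (guaranteed only from C++20 on).
int halfByShift(int v)
{
    return v >> 1;
}

// Safe use of the shift trick: only for values known to be non-negative,
// where both forms give the same answer.
int halfNonNegative(int v)
{
    assert(v >= 0);
    return v >> 1;
}
```

Optimizing compilers generally turn a division by a constant power of two into shifts on their own, so the hand-written shift is unlikely to be where the time goes.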

Hope this helps!

..-=ViKtOr=-..
Hmm... I can't believe someone didn't suggest this. I have a horrible feeling I'm wrong, but:

bool MyRotateBlt(CRect cDest,LPDIRECTDRAWSURFACE7 lpDest,CRect cSrce,LPDIRECTDRAWSURFACE7 lpSrce,double dAng)

Why don't you pass pointers to the DirectDraw surfaces to the function? Please correct me if I'm wrong in this...



Cool! a life? Where can I download one of those then?
jumble-----------
Actually,
LPDIRECTDRAWSURFACE7 is a typedef for a long pointer to an IDirectDrawSurface7.


Told you I was wrong

What I usually do is declare the LPDIRECTDRAWSURFACE and then when passing it to a function plonk something like IDirectDraw7 *pdd into my prototype.

Edited by - jumble on July 6, 2000 7:24:27 PM
jumble-----------
Thanks for all the suggestions. I bought a six pack and sat down and optimized just about everything.

I changed all the floating-point arithmetic to fixed-point arithmetic (with bit shifts).

I picked up the fixed macros from another development web site, I forget which for now.

The result is that the code now runs a whopping 4 times faster! Now it takes 21 ms to rotate a 480 * 480 16bpp surface on a 433 Celeron.

For those who are interested: this routine takes a ROTATED viewport of a surface and blts it onto another surface, kinda like a reverse rotation. I.e. the source is a rectangle that is rotated x degrees, and the destination is a standard rectangle (same dimensions). So it is NOT a sprite rotation routine.

typedef long fixed;			// Our new fixed point type.

#define itofx(x) ((x) << 8)		// Integer to fixed point
#define ftofx(x) ((x) * 256)		// Float to fixed point
#define dtofx(x) ((x) * 256)		// Double to fixed point
#define fxtoi(x) ((x) >> 8)		// Fixed point to integer
#define fxtof(x) ((float) (x) / 256)	// Fixed point to float
#define fxtod(x) ((double)(x) / 256)	// Fixed point to double
#define Mulfx(x,y) (((y) * (x)) >> 8)	// Multiply a fixed by a fixed
#define Divfx(x,y) ((y << 8) / (x))	// Divide a fixed by a fixed

bool MyRotateBlt(CRect cDest,LPDIRECTDRAWSURFACE7 lpDest,CRect cSrce,LPDIRECTDRAWSURFACE7 lpSrce,int nAng)
{
	// lock the destination surface and calculate metrics

	DDSURFACEDESC2 DestDesc;

	ZeroMemory(&DestDesc, sizeof(DestDesc));
	DestDesc.dwSize = sizeof(DestDesc);

	HRESULT result = lpDest->Lock( NULL, &DestDesc, DDLOCK_WRITEONLY, NULL );
	if( result != DD_OK )
		return false;

	int nBytesPerPlane = DestDesc.ddpfPixelFormat.dwRGBBitCount / 8;

	LONG lDestPels = DestDesc.lPitch / nBytesPerPlane;
	LONG lDestPelsLeft = lDestPels - cDest.Width();
	LPWORD lpDestMem = (LPWORD)DestDesc.lpSurface;

	// lock the source surface and calculate metrics

	DDSURFACEDESC2 SrceDesc;

	ZeroMemory(&SrceDesc, sizeof(SrceDesc));
	SrceDesc.dwSize = sizeof(SrceDesc);

	result = lpSrce->Lock( NULL, &SrceDesc, DDLOCK_READONLY, NULL );
	if( result != DD_OK )
		return false;

	LONG lSrcePels = SrceDesc.lPitch / nBytesPerPlane;
	LPWORD lpSrceMem = (LPWORD)SrceDesc.lpSurface;

	// pre-calculate various metrics

	nAng = (nAng + 36000) % 360;

	fixed sinAng = sinarray[nAng];
	fixed cosAng = cosarray[nAng];

	int nSrceCX = cSrce.left + cSrce.Width() / 2;
	int nSrceCY = cSrce.top + cSrce.Height() / 2;

	int nCenterMemLoc = nSrceCX + nSrceCY * lSrcePels;

	fixed nDestLeft		= itofx(- cDest.Width() / 2);
	fixed nDestRight	= nDestLeft + itofx(cDest.Width());
	fixed nDestTop		= itofx(- cDest.Height() / 2);
	fixed nDestBottom	= nDestTop + itofx(cDest.Height());

	int nXDestOff = cDest.left;
	int nYDestOff = cDest.top * lDestPels + nXDestOff;

	fixed sinY;
	fixed cosY;
	for (fixed nY = nDestTop ; nY < nDestBottom; nY += 256 , nYDestOff += lDestPelsLeft)
	{
		sinY = Mulfx(nY,sinAng);
		cosY = Mulfx(nY,cosAng);
		fixed nX = nDestLeft;
		int nW = cDest.Width();
		while (nW --)
		{
			// perform transformation here

			int nSX = fxtoi(Mulfx(nX,cosAng) - sinY);
			int nSY = fxtoi(Mulfx(nX,sinAng) + cosY) * lSrcePels;

			*lpDestMem ++ = lpSrceMem[nCenterMemLoc + nSY + nSX];
			nX += 256;
		}
		lpDestMem = &lpDestMem[lDestPelsLeft];
	}

	lpSrce->Unlock(NULL);
	lpDest->Unlock(NULL);

	return true;
}
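
For anyone who wants to try the same inverse mapping without DirectDraw, here is a rough stand-alone sketch on plain memory buffers. It is not the routine above: the `Image` struct, `rotateViewport` and the `fill` colour are assumptions made for illustration, and it adds a bounds check that the posted code does not have, so a viewport that sticks out past the source reads the fill colour instead of stray memory. It uses the same 8 fractional bits and the same sign convention as the fixed-point version.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using fixed = std::int32_t;   // 8 fractional bits, as in the macros above
constexpr int kFrac = 8;

// Hypothetical stand-in for a locked 16 bpp surface.
struct Image
{
    int width;
    int height;
    std::vector<std::uint16_t> pixels;   // row-major, width * height entries

    std::uint16_t at(int x, int y) const { return pixels[y * width + x]; }
};

// For every destination pixel, rotate its offset from the destination centre
// and read the source pixel found at that offset from (centerX, centerY).
Image rotateViewport(const Image& src, int centerX, int centerY,
                     int destW, int destH,
                     fixed sinAng, fixed cosAng, std::uint16_t fill)
{
    assert(destW >= 0 && destH >= 0);
    Image dst{ destW, destH, std::vector<std::uint16_t>(destW * destH, fill) };

    for (int dy = 0; dy < destH; ++dy)
    {
        const int y = dy - destH / 2;
        for (int dx = 0; dx < destW; ++dx)
        {
            const int x = dx - destW / 2;
            const int sx = centerX + ((x * cosAng - y * sinAng) >> kFrac);
            const int sy = centerY + ((x * sinAng + y * cosAng) >> kFrac);

            // Skip source positions outside the image; they keep the fill colour.
            if (sx >= 0 && sx < src.width && sy >= 0 && sy < src.height)
                dst.pixels[dy * destW + dx] = src.at(sx, sy);
        }
    }
    return dst;
}
```

With `sinAng = 0` and `cosAng = 256` this degenerates to a straight copy of the region around the centre, which makes a handy sanity check before trying other angles.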


Thanks for all your help and suggestions.

--------------------------
Carpe Diem

Edited by - DeltaVee on July 6, 2000 7:41:58 PM

This topic is closed to new replies.
