Incrementing Pointers

April 29, 2000 01:30 AM

i said:

"this would fill the array with zeros, but I would recommend using p, or even better, memcpy() [or CopyMemory, which is the same thing]"

the p is supposed to be p
i think

i absolutely can't stand how this board screws up my text
>:-[

there should be a switch for if you want plaintext or html

adammil

122

April 29, 2000 01:32 AM

gaaahhh! it did it again!

it's supposed to be p bracket i bracket
there

adamm@san.rr.com

Zipster

2,420

April 29, 2000 01:36 AM

Holy shit, adammil has A LOT of free time

.

ziplux

Author

122

April 29, 2000 08:49 AM

Hi again. I think I understand it now. All I wanted to do was make my alpha blending code more efficient, so I'll post the main loop before I changed it:

    for (index_y = 0; index_y < height; index_y++)
    {
        for (index_x = 0; index_x < width; index_x++)
        {
            if((svidbuffer[index_x + sx + slPitch16 * (index_y + sy)] != colorkey))
            {
                dindex = index_x + dx + dlPitch16 * (index_y + dy);
                sindex = index_x + sx + slPitch16 * (index_y + sy);

                sred = (((svidbuffer[sindex] >> 11) & 0x1F));
                sgreen = (((svidbuffer[sindex] >> 5) & 0x3F));
                sblue = ((svidbuffer[sindex]) & 0x1F);

                dred = (((dvidbuffer[dindex] >> 11) & 0x1F));
                dgreen = (((dvidbuffer[dindex] >> 5) & 0x3F));
                dblue = ((dvidbuffer[dindex]) & 0x1F);

                fred = ((l_Alpha[alpha][sred] - l_Alpha[alpha][dred])) + dred;  // same as (alpha * sred - alpha * dred) + dred
                fgreen = ((l_Alpha[alpha][sgreen] - l_Alpha[alpha][dgreen])) + dgreen;
                fblue = ((l_Alpha[alpha][sblue] - l_Alpha[alpha][dblue])) + dblue;

                dvidbuffer[dindex] = RGB16(fred << 3,fgreen << 2,fblue << 3);  // scale up the values
            }
        }
    }

and the new version incrementing pointers:

    for (index_y = 0; index_y < height; index_y++)
    {
        for (index_x = 0; index_x < width; index_x++)
        {
            dindex = index_x + dx + dlPitch16 * (index_y + dy);
            sindex = index_x + sx + slPitch16 * (index_y + sy);

            if((*(USHORT*)svidbuffer != colorkey))
            {
                sred = (((*(USHORT*)svidbuffer >> 11) & 0x1F));
                sgreen = (((*(USHORT*)svidbuffer >> 5) & 0x3F));
                sblue = ((*(USHORT*)svidbuffer) & 0x1F);

                dred = (((*(USHORT*)dvidbuffer >> 11) & 0x1F));
                dgreen = (((*(USHORT*)dvidbuffer >> 5) & 0x3F));
                dblue = ((*(USHORT*)dvidbuffer) & 0x1F);

                fred = ((l_Alpha[alpha][sred] - l_Alpha[alpha][dred])) + dred;  // same as (alpha * sred - alpha * dred) + dred
                fgreen = ((l_Alpha[alpha][sgreen] - l_Alpha[alpha][dgreen])) + dgreen;
                fblue = ((l_Alpha[alpha][sblue] - l_Alpha[alpha][dblue])) + dblue;

                *(USHORT*)dvidbuffer = RGB16(fred << 3,fgreen << 2,fblue << 3);  // scale up the values
            }
            svidbuffer += sindex;
            dvidbuffer += dindex;
        }
    }

if this is right, one more question. are while loops faster than for loops? Thanks in advance.

Visit our web site:
Asylum Entertainment

My Geekcode: "GCS d s: a14 C++$ P+(++) L+ E-- W+++$ K- w++(+++) O---- M-- Y-- PGP- t XR- tv+ b++ DI+(+++) D- G e* h!"Decode my geekcode!Geekcode.com

ziplux

Author

122

April 29, 2000 08:55 AM

That must not be right because it just crashes when I try to run the program. Is it the typecast I'm doing? I have to go now, I'll look into it more later.


adammil

122

April 29, 2000 01:00 PM

Hehe.. yeah I do spend a bit too long typing my posts

While loops aren't intrinsically faster than for loops.

for(i=0;i<100;i++)
{
code;
}

is equivalent to this:

i=0;
while(i<100)
{
code;
i++;
}
-except- that if you declare a variable in the init part of the for loop, it goes out of scope after the loop:

for(int i=0;i<100;i++)
{
code;
}

is like this:

{
int i=0;
while(i<100)
{
code;
i++;
}
}
// i goes out of scope here

unfortunately, VC6 doesn't work this way, it uses the obsolete for scoping, so i is still visible after the loop..
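
A minimal sketch of the equivalence described above, assuming a compiler that follows the standard for-scoping rule. The function names sum_with_for and sum_with_while are made up for illustration; both loops should do the same work.

```cpp
#include <cassert>

// Sum 0..n-1 with a for loop. In standard C++, i is scoped to the loop.
int sum_with_for(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++)
    {
        total += i;
    }
    // i is not visible here (the thread notes VC6 would still allow it)
    return total;
}

// The same loop written as a while. The extra braces give i the same scope.
int sum_with_while(int n)
{
    assert(n >= 0);
    int total = 0;
    {
        int i = 0;
        while (i < n)
        {
            total += i;
            i++;
        }
    }
    return total;
}
```

Calling either function with the same n should give the same result, which is the point: the choice between the two forms is about readability, not speed.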

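BUT.. if you are counting down to zero in a for loop it can be a little faster than counting up to something, because on x86 the decrement already sets the zero flag, so no separate compare is needed.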
For example:

for(i=100;i;i--);
is probably faster than
for(i=0;i<100;i++);
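
As a rough illustration of the count-down idea, here is a sketch with two hypothetical helpers, fill_up and fill_down, that fill a 16-bit buffer. Whether the second is actually faster depends on the compiler and CPU, so treat it as something to measure rather than a rule.

```cpp
#include <cassert>
#include <cstddef>

// Count up: the loop test compares i against n on every pass.
void fill_up(unsigned short *p, std::size_t n, unsigned short value)
{
    assert(p || n == 0);
    for (std::size_t i = 0; i < n; i++)
        p[i] = value;
}

// Count down to zero: the loop test is just "is i non-zero?".
// The pointer is bumped separately, so memory is still written front to back.
void fill_down(unsigned short *p, std::size_t n, unsigned short value)
{
    assert(p || n == 0);
    for (std::size_t i = n; i; i--)
        *p++ = value;
}
```

Both helpers write the same n elements; only the loop counter runs in a different direction.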

here's the main part of my alpha blender (the 16-bit part anyway)

it's fairly fast.. as long as you're not using it for large parts of the screen on every frame, it's perfectly fine..
It works great for alpha blending particles or special effects.. Or for making your health bar transparent. Or for one time, full-screen alpha blends, it's alright too.
the lookup table is nice and saves all multiplies and divides

but what it really needs is a good assembly re-write... that would probably double the speed..
unfortunately, i have like 30k of alpha blending code :/
(all different kinds of alpha blends that handle all different bit depths.. etc..) and I have more important things to do!

I'm embarrassed by this code; it's the worst looking code in the whole project!

It bears a comment warning all who might gaze upon it and be corrupted..

for(y=0;y<hei;y++)
{
    for(x=0;x<wid;x++)
    {
        mul = amem[x] * 512 + 256; // *512 is done with a shift
        swpixel = *((WORD*)smem+x);
        sr = (swpixel >> 11), sg = (swpixel >> 5) & 0x3F, sb = swpixel & 0x1F;
        dwpixel = *((WORD*)dmem+x);
        dr = (dwpixel >> 11), dg = (dwpixel >> 5) & 0x3F, db = dwpixel & 0x1F;
        r = dr+MulTable[mul+(sr-dr)];
        g = dg+MulTable[mul+(sg-dg)];
        b = db+MulTable[mul+(sb-db)];
        *((WORD *)dmem+x) = (r<<11) | (g<<5) | b;
    }
    dmem += dpitch;
    smem += spitch;
    amem += apitch;
}
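
The post doesn't show how MulTable is built, so the following is only a guess at a layout that would fit the indexing above: one 512-entry row per alpha value, with the zero difference at offset 256. The 0..255 alpha range, the rounding, and the names build_mul_table and blend565 are all assumptions for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical layout: table[a * 512 + 256 + d] = d * a / 255,
// for alpha a in 0..255 and signed channel difference d in -256..255.
std::vector<int> build_mul_table()
{
    std::vector<int> table(256 * 512);
    for (int a = 0; a < 256; a++)
        for (int d = -256; d < 256; d++)
            table[a * 512 + 256 + d] = d * a / 255;
    return table;
}

// Blend one 5-6-5 source pixel over one destination pixel using the table.
// alpha 0 keeps the destination, alpha 255 gives the source.
unsigned short blend565(unsigned short s, unsigned short d, int alpha,
                        const std::vector<int> &table)
{
    assert(alpha >= 0 && alpha <= 255);
    int mul = alpha * 512 + 256;
    int sr = s >> 11, sg = (s >> 5) & 0x3F, sb = s & 0x1F;
    int dr = d >> 11, dg = (d >> 5) & 0x3F, db = d & 0x1F;
    int r = dr + table[mul + (sr - dr)];
    int g = dg + table[mul + (sg - dg)];
    int b = db + table[mul + (sb - db)];
    return (unsigned short)((r << 11) | (g << 5) | b);
}
```

The table would be built once at startup; after that each channel costs one subtraction, one lookup and one add, with no multiply or divide in the inner loop.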

but yours wouldn't work for a couple reasons..

mainly, you are not stepping svidbuffer and dvidbuffer one pixel at a time. sindex and dindex are complete offsets from the start of the surface, and you add them to the pointers after every single pixel.. so the pointers leap further ahead each time and run off the end of the surface very quickly.
That would explain the crash.

next.. i warned you about calculating y*pitch+x for every pixel, which is the reason people told you to use a pointer in the first place, and that's exactly what you're doing

dindex = index_x + dx + dlPitch16 * (index_y + dy);
sindex = index_x + sx + slPitch16 * (index_y + sy);

that's inside the main loop!

so it is recalculating sindex and dindex for every pixel.. that will KILL the speed!

I have to go, but try something like this instead:

// byte pointers, so the x offset is in bytes (2 per 16-bit pixel)
BYTE *srcPtr = (BYTE *)src.lpSurface + srcY*src.Pitch + srcX*2;
BYTE *destPtr = (BYTE *)dest.lpSurface + destY*dest.Pitch + destX*2;

for(y=0;y<hei;y++)
{
    for(x=0;x<wid;x++)
    {
        // do alpha blend on *((WORD *)srcPtr + x) and *((WORD *)destPtr + x)
        // notice how in my blender I didn't read the pixel over and over,
        // I only read the pixel once and stored it in spixel. it's faster to
        // read spixel than to keep re-evaluating *((WORD *)srcPtr + x)
        // plus, the compiler could probably keep spixel in a register
    }
    srcPtr += src.Pitch;
    destPtr += dest.Pitch;
}
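
To make the row-pointer pattern above concrete, here is a small self-contained sketch. It does a color-keyed copy rather than a real blend, just to keep it short; keyed_copy16 and its parameters are hypothetical, and pitches are assumed to be in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Copy a wid x hei rectangle of 16-bit pixels, skipping colorkey pixels.
// Pitches are in BYTES; x and y offsets are in pixels.
void keyed_copy16(const std::uint8_t *src, int srcPitch, int srcX, int srcY,
                  std::uint8_t *dest, int destPitch, int destX, int destY,
                  int wid, int hei, std::uint16_t colorkey)
{
    assert(wid >= 0 && hei >= 0);
    // Work out the starting address once, outside both loops.
    const std::uint8_t *srcRow = src + srcY * srcPitch + srcX * 2;
    std::uint8_t *destRow = dest + destY * destPitch + destX * 2;

    for (int y = 0; y < hei; y++)
    {
        const std::uint16_t *s = reinterpret_cast<const std::uint16_t *>(srcRow);
        std::uint16_t *d = reinterpret_cast<std::uint16_t *>(destRow);
        for (int x = 0; x < wid; x++)
        {
            std::uint16_t spixel = s[x];   // read the source pixel once
            if (spixel != colorkey)
                d[x] = spixel;             // a real blender would mix here
        }
        srcRow += srcPitch;                // step one row, in bytes
        destRow += destPitch;
    }
}
```

The only per-pixel address work left is the s[x] and d[x] indexing; the y*pitch multiply happens once per call instead of once per pixel.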

adamm@san.rr.com

adammil

122

April 29, 2000 01:04 PM

d''oh!

stupid board!

all those things like this:
for(y=0;y{
for(x=0;x{

were supposed to be like:

for(y=0;y<_hei;y++)
{
for(x=0;x<_wid;x++)

except without the _

[and it'd better work this time!]

adamm@san.rr.com

ziplux

Author

122

April 29, 2000 02:13 PM

Ok, I think I get it now. I rewrote the function, and it no longer crashes, but it doesn't work properly. It looks like the pitch is off or something, so here's the whole function:

void Blend16(int sx,int sy, int width, int height, int dx, int dy, int alpha,
             int slPitch16, int dlPitch16, USHORT colorkey,
             USHORT* svidbuffer, USHORT* dvidbuffer, int pixel_format)
{
    // To Do:  Rewrite this function to work with 5.5.5 cards
    int index_x, index_y;
    int dindex, sindex;
    UCHAR sred,sgreen,sblue,dred,dgreen,dblue,fred,fgreen,fblue;

    dindex = dx + dlPitch16 * dy;
    sindex = sx + slPitch16 * sy;
    svidbuffer += sindex;
    dvidbuffer += dindex;

    for (index_y = 0; index_y < height; index_y++)
    {
        for (index_x = 0; index_x < width; index_x++)
        {
            if((*(dvidbuffer + index_x + dx) != colorkey))
            {
                sred = (*(svidbuffer + index_x + dx) >> 11) & 0x1F;
                sgreen = (*(svidbuffer + index_x + dx) >> 5) & 0x3F;
                sblue = *(svidbuffer + index_x + dx) & 0x1F;

                dred = (*(dvidbuffer + index_x + dx) >> 11) & 0x1F;
                dgreen = (*(dvidbuffer + index_x + dx) >> 5) & 0x3F;
                dblue = *(dvidbuffer + index_x + dx) & 0x1F;

                fred = ((l_Alpha[alpha][sred] - l_Alpha[alpha][dred])) + dred;  // same as (alpha * sred - alpha * dred) + dred
                fgreen = ((l_Alpha[alpha][sgreen] - l_Alpha[alpha][dgreen])) + dgreen;
                fblue = ((l_Alpha[alpha][sblue] - l_Alpha[alpha][dblue])) + dblue;

                *(dvidbuffer + index_x + dx) = RGB16(fred << 3,fgreen << 2,fblue << 3);  // scale up the values
            }
        }
        svidbuffer += slPitch16;
        dvidbuffer += dlPitch16;
    }
}

and here's the call to it:

    // lock dest surface (lpddswork) and source surface (lpddslogo)
    dvidbuffer = Lock_Surface16(lpddswork, &dpitch);
    svidbuffer = Lock_Surface16(lpddslogo, &spitch);

    // draw logo with alpha transparency
    Blend16(0,0,600,200,0,0,logofade, spitch >> 1, dpitch >> 1, RGB16(0,0,0), svidbuffer, dvidbuffer, g_pixformat);

    Unlock_Surface(lpddslogo);
    Unlock_Surface(lpddswork);

I must be doing something wrong, but it seems to me like I've done what you have suggested (but maybe I still don't get it, I don't know). Thanks for all your help.


adammil

122

April 30, 2000 10:10 AM

I don't know if these are all the problems, but...

if((*(dvidbuffer + index_x + dx) != colorkey))
you already added dx to the vid buffer, now you're adding it again!

[you do it on all the other lines as well]

(in the call)
spitch >> 1, dpitch >> 1

you can't safely assume that the pitch will be divisible by two.. in practice it almost always is (unless the surface width is odd and it's created in system memory)

and this isn't a bug, but more of a problem with accessing surfaces directly.. directly writing and especially reading surfaces in video memory is EXTREMELY slow on almost every graphics card out there.. so you would definitely only want to do alpha blending if both surfaces are in system memory.. unless you are doing the alpha blend only once or something

plus it would be faster if you only read from the surfaces once, and stored the pixel in variables, like spixel and dpixel
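
Putting those points together, a corrected loop might look roughly like the sketch below. It is not the original function: plain arithmetic stands in for the l_Alpha table, alpha is assumed to run 0..255, the pixel is packed directly instead of going through RGB16, and the name Blend16Sketch is made up. It also tests the color key against the source pixel, as the first version in the thread did, rather than against the destination.

```cpp
#include <cassert>
#include <cstdint>

// Pitches are in 16-bit pixels, matching slPitch16/dlPitch16 in the thread.
void Blend16Sketch(int sx, int sy, int width, int height, int dx, int dy,
                   int alpha, int slPitch16, int dlPitch16,
                   std::uint16_t colorkey,
                   const std::uint16_t *svidbuffer, std::uint16_t *dvidbuffer)
{
    assert(alpha >= 0 && alpha <= 255);
    // Apply the x/y offsets exactly once, before the loops.
    svidbuffer += sx + slPitch16 * sy;
    dvidbuffer += dx + dlPitch16 * dy;

    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            std::uint16_t spixel = svidbuffer[x];  // one read per surface
            if (spixel == colorkey)
                continue;                          // key is tested on the source
            std::uint16_t dpixel = dvidbuffer[x];

            int sr = spixel >> 11, sg = (spixel >> 5) & 0x3F, sb = spixel & 0x1F;
            int dr = dpixel >> 11, dg = (dpixel >> 5) & 0x3F, db = dpixel & 0x1F;

            // Plain arithmetic stands in for the lookup table here.
            int r = dr + (sr - dr) * alpha / 255;
            int g = dg + (sg - dg) * alpha / 255;
            int b = db + (sb - db) * alpha / 255;

            dvidbuffer[x] = (std::uint16_t)((r << 11) | (g << 5) | b);
        }
        svidbuffer += slPitch16;   // next row
        dvidbuffer += dlPitch16;
    }
}
```

Inside the loops the pointers are only ever indexed by x, so neither dx nor sx can be applied twice.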

fred = ((l_Alpha[alpha][sred] - l_Alpha[alpha][dred])) + dred;

also, it looks like you have a 2 dimensional lookup table.. and this means it takes an extra multiply and an add per lookup
so you're adding 6 multiplies and 6 adds per pixel there (two lookups for each of the three channels)
(or 6 shifts and adds per pixel, depending on the size of the table..)

you should use a 1 dimensional lookup table and do the indexing yourself
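
One way to do that indexing by hand, sketched under assumptions: the table has 32 alpha levels and 64 entries per level, so the row offset is just a shift, and the row pointer is computed once per call instead of once per lookup. The constants and the names build_alpha_table and blend_channel are hypothetical; the real l_Alpha dimensions aren't given in the thread.

```cpp
#include <cassert>
#include <vector>

const int kLevels = 32;       // hypothetical number of alpha levels
const int kChannelBits = 6;   // 64 entries per level covers 5- and 6-bit channels

// Flat table: entry (a << 6) + c holds c * a / (kLevels - 1).
std::vector<unsigned char> build_alpha_table()
{
    std::vector<unsigned char> table(kLevels << kChannelBits);
    for (int a = 0; a < kLevels; a++)
        for (int c = 0; c < (1 << kChannelBits); c++)
            table[(a << kChannelBits) + c] =
                (unsigned char)(c * a / (kLevels - 1));
    return table;
}

// Blend one channel. row points at the start of the chosen alpha level,
// so each lookup is a plain 1-D index with no multiply.
int blend_channel(const unsigned char *row, int s, int d)
{
    assert(s >= 0 && s < 64 && d >= 0 && d < 64);
    return row[s] - row[d] + d;  // same formula as fred/fgreen/fblue above
}
```

A caller would take `&table[alpha << kChannelBits]` once before the pixel loops and pass that row pointer to all six lookups of every pixel.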

adamm@san.rr.com

Kylotan

10,513

April 30, 2000 10:20 AM

quote: Original post by adammil

Hehe.. yeah I do spend a bit too long typing my posts

While loops aren't intrinsically faster than for loops.

You may find that a do...while loop can be minimally faster though, since the check comes after the loop body rather than before it, saving a couple of jumps at the end. Obviously not all loops are appropriate to use do...while with, and the gain will be pretty negligible. You'd only really need to consider this in tight inner loops with 1 or 2 instructions inside them.
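
A small sketch of the difference, with made-up function names total_while and total_do_while. The thing to watch with do...while is that the body always runs at least once, so an empty range needs its own guard.

```cpp
#include <cassert>

// while form: the test runs before the first pass, so count == 0 is safe.
int total_while(const int *p, int count)
{
    assert(count >= 0);
    int total = 0;
    while (count > 0)
    {
        total += *p++;
        count--;
    }
    return total;
}

// do...while form: the test runs after the body, so the empty case
// has to be handled up front.
int total_do_while(const int *p, int count)
{
    assert(count >= 0);
    int total = 0;
    if (count == 0)
        return total;
    do
    {
        total += *p++;
    } while (--count);
    return total;
}
```

Any speed difference between the two is likely to be tiny, and an optimizing compiler may well turn one form into the other anyway.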


This topic is closed to new replies.
