Particles
I have made a particle system demo. It is quite simple (it consists only of lines), but it is a killer for the system (it has 150,000 lines).
It works quite well on P3 systems with a GeForce (about 15fps), but I got terrible results on an Athlon: an Athlon 700 with 128MB of PC133 RAM and a GeForce DDR does 1.7fps, and my P2 333 does 2fps. I have found only two possible bottlenecks:
1. When copying vertex data to GeForce
2. Linked list speed
Here is the code that I use for the vertex copy (if it helps):
float *ptr; // points into the locked vertex buffer
ptr[0]=ptr[3]=Curr->Pos.x;
ptr[1]=ptr[4]=Curr->Pos.y;
ptr[2]=ptr[5]=Curr->Pos.z;
ptr[8]=ptr[11]=Curr->Pos.x+=Curr->Direction.x;
ptr[9]=ptr[12]=Curr->Pos.y+=Curr->Direction.y;
ptr[10]=ptr[13]=Curr->Pos.z+=Curr->Direction.z;
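float *ptr; // points into the locked vertex buffer
CPosition& pos = Curr->Pos;
float &x = pos.x, &y = pos.y, &z = pos.z; // references must be bound where they are declared
CDirection dir = Curr->Direction; // local copy of the direction
ptr[0]=ptr[3]=x;
ptr[1]=ptr[4]=y;
ptr[2]=ptr[5]=z;
ptr[8]=ptr[11]=x+=dir.x;
ptr[9]=ptr[12]=y+=dir.y;
ptr[10]=ptr[13]=z+=dir.z;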
If that's really your bottleneck, this should help.
#pragma DWIM // Do What I Mean!
~ Mad Keith ~
Edited by - MadKeithV on June 6, 2000 1:34:33 PM
It's only funny 'till someone gets hurt. And then it's just hilarious. Unless it's you.
Thanks MadKeithV. While this change still does no good on the Athlon system, it adds some speed on Intel... although I'm still not sure why this happens...
Just a simple question: what compiler and what settings are you using to generate the code? Maybe you are taking advantage of the PIII's SSE or something, and it's not kicking in on the other systems...
http://www.ill-lusion.com
laxdigital.com
ziggy@laxdigital.com
The reason MadKeithV's optimization speeds things up is that you read each value through the structure pointer only once and keep it in a normal variable (usually a register), which is extremely fast. All the assignments and calculations then use that variable instead of going back through the pointer, hence the speed-up on some computers... It's going to speed things up a lot, especially if you've placed the above code in a for loop (either nested or a single for loop)...
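Here is a rough sketch of that idea inside a loop, with plain local copies instead of references. The types, the `Next` link and the stride of 14 floats per particle are assumptions made for illustration; the thread only shows `Pos`, `Direction` and the indices 0 to 13. The locals matter because the compiler usually cannot prove that a store through `out` leaves `curr->Pos` untouched, so without them it may reload the members after every write.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the poster's types.
struct Vec3 { float x, y, z; };
struct Particle {
    Vec3 Pos;
    Vec3 Direction;
    Particle* Next;  // assumed linked-list link
};

// Assumed stride: the snippets in this thread touch indices 0..13.
const std::size_t kFloatsPerParticle = 14;

// Writes one line (old position, new position) per particle into `out`,
// advances every particle, and returns how many particles were processed.
std::size_t FillLines(Particle* head, float* out)
{
    std::size_t count = 0;
    for (Particle* curr = head; curr != nullptr; curr = curr->Next) {
        // Read each member once into a local (likely a register).
        float x = curr->Pos.x, y = curr->Pos.y, z = curr->Pos.z;
        const Vec3 dir = curr->Direction;

        out[0] = out[3] = x;
        out[1] = out[4] = y;
        out[2] = out[5] = z;

        x += dir.x;
        y += dir.y;
        z += dir.z;

        out[8]  = out[11] = x;
        out[9]  = out[12] = y;
        out[10] = out[13] = z;

        // Write the new position back once.
        curr->Pos.x = x;
        curr->Pos.y = y;
        curr->Pos.z = z;

        out += kFloatsPerParticle;
        ++count;
    }
    return count;
}
```

Whether this helps on a given CPU depends on the compiler and its settings, so it is worth timing both versions.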
anyways... hope this helps a bit...
..-=ViKtOr=-..
I don't know a lot about this, but maybe this will put you on the right track.
First, how are you implementing the hardware acceleration? Are you using D3D, OpenGL, or the GeForce hardware drivers?
Second, and this only applies if you are using Direct3D: are you batching your calls to the hardware properly? You should create a vertex buffer, put all the data in it, and apply the appropriate operations to that structure. D3D is much faster if you batch your calls.
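As a rough illustration of batching, here is a minimal sketch that collects line vertices in a fixed-size buffer and hands them over in one go when the buffer is full. `LineBatcher`, `LineVertex` and the flush callback are made-up names; the callback only stands in for whatever lock/copy/draw sequence the real renderer uses, and no actual D3D call is shown.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical vertex layout: position only.
struct LineVertex { float x, y, z; };

class LineBatcher {
public:
    // Stand-in for "lock the vertex buffer, copy, unlock, draw".
    using FlushFn = std::function<void(const LineVertex*, std::size_t)>;

    LineBatcher(std::size_t capacity, FlushFn flush)
        : capacity_(capacity), flush_(std::move(flush))
    {
        buffer_.reserve(capacity_);
    }

    // Queues one line; submits the batch first if the line would not fit.
    void AddLine(const LineVertex& a, const LineVertex& b)
    {
        if (buffer_.size() + 2 > capacity_) {
            Flush();
        }
        buffer_.push_back(a);
        buffer_.push_back(b);
    }

    // Submits whatever is queued as a single batch.
    void Flush()
    {
        if (buffer_.empty()) {
            return;
        }
        flush_(buffer_.data(), buffer_.size());
        buffer_.clear();
    }

private:
    std::size_t capacity_;
    FlushFn flush_;
    std::vector<LineVertex> buffer_;
};
```

With a capacity of 1024 vertices, 150,000 lines (300,000 vertices) would go out in 293 batches instead of 150,000 separate submissions; call `Flush()` once at the end of the frame to send the partial last batch.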
That's all I can think of, other than the fact that you are trying to render 150,000 particles at a frame rate of 60 (I assume). I haven't done the math, but that is probably the limit of the acceleration.
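For what it's worth, the math is short. This sketch assumes every particle is one line made of two vertices, as described in the first post; `VerticesPerSecond` is just an illustrative helper.

```cpp
// Vertices the card must process per second when each particle is a
// two-vertex line (assumption taken from the first post).
constexpr long long VerticesPerSecond(long long lines, long long fps)
{
    return lines * 2 * fps;
}

// 150,000 lines at 60fps -> 18,000,000 vertices per second.
// 150,000 lines at 15fps ->  4,500,000 vertices per second.
static_assert(VerticesPerSecond(150000, 60) == 18000000, "60fps case");
static_assert(VerticesPerSecond(150000, 15) == 4500000, "15fps case");
```

So 60fps would need four times the vertex throughput that the reported 15fps implies.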
Creativity -- Concept -- Code
Your game is nothing if you don't have all three.
http://www.wam.umd.edu/~dlg/terc.htm
June 12, 2000 01:03 AM
I use D3D. I also use one vertex buffer, which contains 1024 vertices; ptr is a pointer into that vertex buffer. The vertex buffer is created with D3DVBCAPS_WRITEONLY (on hardware T&L) and locked with the DDLOCK_WRITEONLY / DDLOCK_DISCARDCONTENTS flags. With drawing disabled I got 15fps. With sending disabled I got 30fps on the P2 333.
I think that 15fps is NOT the best that the hardware can do.
Well the demo is at http://www2.arnes.si/~mdolen/Storm3D.zip if you would like to see it.