
FPS drop

Started by Spark, March 21, 2000 12:57 PM
8 comments, last by Spark 24 years, 5 months ago
Hi, I'm trying to add the values of two large arrays like this:

    for (i = 0; i < XRES*YRES; i++)
    {
        fb1[i].r = fb1[i].r + fb2[i].r;
        fb1[i].g = fb1[i].g + fb2[i].g;
        fb1[i].b = fb1[i].b + fb2[i].b;
    }

This is done once per frame and causes the FPS to drop from 42 to 31 FPS. It feels like a really bad way of doing it, so do any of you know of anything that may speed it up?

// Spark
hmm, the code got mangled when I posted it: every line is supposed to index both arrays with [i]

// Spark

Edited by - Spark on 3/21/00 1:02:09 PM
Spark, that's just because the [i] got interpreted as an italics tag.

I'm not positive this will give you better results (depends on how good your compiler is), but here's a different way of doing it. I don't know what types fb1 and fb2 are, so I'm going to call them FBType.

    FBType *pfb1 = fb1;
    const FBType *pfb2 = fb2;
    const FBType * const pend = fb1 + (XRES*YRES); // points to end of array
    while (pfb1 != pend)
    {
        pfb1->r += pfb2->r;
        pfb1->g += pfb2->g;
        pfb1->b += pfb2->b;
        pfb1++;
        pfb2++;
    }

Not sure you'll get an improvement, but that's at least another way to do it. I'm kinda gambling on the fact that it's faster to move pointers along than to index into the array, which compilers SHOULD do, but maybe it's confused because there are multiple fields?

You could also try using the register keyword for the pointers, but I've never used it so I can't say if it would get you anything.

Hope this helps. Let me know. And you're using a profiler, right?
Thanks for the reply Stoffel, but it only gives a minimal speed increase. =/
It eats up about 25% of the time spent calculating each frame, so it must be optimized or removed in some way.

// Spark
Are you positive this is where the bottleneck is? Are you using a profiler tool to tell you this? Only reason I ask is that I can't see any more efficient way to do the task you've presented.
I'm using a profiler and it spends about 23.2% of the time in that function.

// Spark
Consider what you're doing. At 640x480, you're doing multiple memory accesses on almost 310,000 elements. This is a cache-coherency nightmare.

In general you want to avoid doing anything on a per-pixel basis in a realtime situation.
Volition, Inc.
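
As a rough illustration of keeping the pass as memory-friendly as possible: a minimal sketch, assuming (the thread never shows it) that FBType is a plain struct of three unsigned char components with no padding, so both buffers can be walked as one contiguous run of bytes. If the components are wider or the struct is padded, this doesn't apply.

    struct FBType { unsigned char r, g, b; };   // assumed layout, not shown in the thread

    // Add src into dst component by component, touching memory strictly linearly.
    // The addition wraps on overflow, exactly like the original per-field adds would.
    void AddBuffers(FBType *dst, const FBType *src, unsigned pixels)
    {
        unsigned char *d = reinterpret_cast<unsigned char *>(dst);
        const unsigned char *s = reinterpret_cast<const unsigned char *>(src);
        const unsigned total = pixels * 3;      // r, g, b per pixel, assuming sizeof(FBType) == 3
        for (unsigned i = 0; i < total; i++)
            d[i] = static_cast<unsigned char>(d[i] + s[i]);
    }

Even then the cost is mostly memory bandwidth, so as the post above says, the bigger win comes from touching fewer pixels per frame rather than from rewriting the loop.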
Also, if one (or both) of the arrays is in video memory you will get crappy performance no matter what you do. In general reads from video memory are to be avoided like the plague and if you are going to do reads on a surface it should NOT be in video ram.

If that doesn't help, try to think of another way to achieve the same effect, or post it up here so we can try and optimize the algorithm itself.

PreManDrake
That's a definite for the video ram reply. If you are accessing video ram (it looks like you are), try just making the whole array (or surface, if you're using DX) in system memory. It'll be slower when you go to blit, but definitely faster than manipulating pixels one at a time in video memory the way you're doing now.


ColdfireV
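
For what it's worth, with DirectDraw (the "if you're using DX" aside above suggests that's what's in play; treat it as an assumption) the surface's memory placement is chosen at creation time. A sketch of requesting an off-screen surface explicitly in system memory, where pDD, width and height are placeholders:

    #include <windows.h>
    #include <ddraw.h>   // link with ddraw.lib

    // pDD is assumed to be an already-initialised LPDIRECTDRAW7 interface.
    LPDIRECTDRAWSURFACE7 CreateSysMemSurface(LPDIRECTDRAW7 pDD, DWORD width, DWORD height)
    {
        DDSURFACEDESC2 ddsd;
        ZeroMemory(&ddsd, sizeof(ddsd));
        ddsd.dwSize = sizeof(ddsd);
        ddsd.dwFlags = DDSD_CAPS | DDSD_WIDTH | DDSD_HEIGHT;
        // DDSCAPS_SYSTEMMEMORY keeps the surface out of video ram,
        // so CPU reads during the mixing pass stay cheap.
        ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN | DDSCAPS_SYSTEMMEMORY;
        ddsd.dwWidth = width;
        ddsd.dwHeight = height;

        LPDIRECTDRAWSURFACE7 pSurface = NULL;
        if (FAILED(pDD->CreateSurface(&ddsd, &pSurface, NULL)))
            return NULL;
        return pSurface;
    }

The tradeoff mentioned above still holds: the final blit from a system-memory surface to the screen is slower than a video-to-video blit, but it only happens once per frame.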
I'm trying to use lightmaps in software by first rendering the original textures and then the lightmaps into different arrays, and at the end mixing the colors of the lightmap array with the texture array. Both surfaces are in system memory and the resolution is 320x240. I can't figure out any way to do this without adding the colors at the end. I could of course mix the textures directly in the texture-mapping function, but that's the same thing, I guess.

// Spark
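
Since both buffers are already in system memory and the add itself is the hot spot, one era-appropriate option (not raised in the thread, so purely an aside) is MMX: PADDUSB adds eight unsigned bytes at a time with saturation, which also keeps the colors from wrapping when texture plus lightmap overflows 255. A rough sketch, assuming both buffers are tightly packed 8-bit components and the byte count is a multiple of 8:

    #include <mmintrin.h>   // MMX intrinsics

    // dst += src with unsigned saturation, eight bytes per iteration.
    void AddSaturate(unsigned char *dst, const unsigned char *src, int bytes)
    {
        __m64 *d = (__m64 *)dst;
        const __m64 *s = (const __m64 *)src;
        for (int i = 0; i < bytes / 8; i++)
            d[i] = _mm_adds_pu8(d[i], s[i]);   // PADDUSB: clamps at 255 instead of wrapping
        _mm_empty();                           // EMMS: restore the FPU state before any float math
    }

Whether a straight add is the right way to mix the lightmaps is a separate question; the point is only that packing the work cuts the loop overhead and the number of memory operations per pixel.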

