Performance problem with reflections
Hello there. I have a pretty annoying performance hit (only 18-20 FPS) when I use reflections (6 reflections of two rectangular boxes and a sphere). I am using the stencil buffer (which is supposed to be free of performance hits according to the RIVA TNT2 documentation, damn it!!). I'm developing on an Intel Celeron (yeah.. haha) 433 MHz, 192 MB SDRAM, RIVA TNT2 16 MB, Windows 98. Many games run much better at 800x600x32 than this piece of crap I created. Please help me! What am I doing wrong?!
Alpha blending (reflections) is a HUGE performance hit. The stencil buffer might take one of your texture calculation thingies, so if you have a couple of textures, that will be a HUGE performance hit too.
Check out devel.nvidia.com and look at their opengl performance articles.
Nitzan
-------------------------
www.geocities.com/nitzanw
www.scorchedearth3d.net
-------------------------
Make sure your app is running in 32-bit colour, or else you'll be using a software stencil buffer, which is SLOW! It sounds exactly like the problem you're having.
-----------------------
"When I have a problem on an Nvidia, I assume that it is my fault. With anyone else's drivers, I assume it is their fault" - John Carmack
January 23, 2002 11:40 AM
quote:
Alpha blending (reflections) is a HUGE performance hit. The stencil buffer might take one of your texture calculation thingies, so if you have a couple of textures, that will be a HUGE performance hit too.
I think you got some things mixed up there. Alpha blending is not a huge performance hit: it doesn't affect vertex transformation speed at all, only fillrate. That shouldn't be critical if you're not fillrate limited, and the original poster obviously is not.
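To put a rough number on the fillrate point, here is a back-of-the-envelope sketch. The function name is made up for illustration, and the inputs are assumptions: 800x600 is only the resolution mentioned for other games, and the worst case assumed here is that each of the 6 reflections plus one base pass touches every pixel on screen.

```c
#include <assert.h>

/* Rough pixels-per-second estimate for full-screen layers.
   Hypothetical helper: every argument is an assumption supplied by
   the caller, not a measurement of any real card. */
double pixels_per_second(int width, int height, int layers, double fps)
{
    assert(width > 0 && height > 0 && layers > 0 && fps > 0.0);
    return (double)width * height * layers * fps;
}
```

With 800x600, 7 layers and 20 FPS this comes out at 67.2 million pixels per second. Comparing that figure with the fill rate quoted in your card's datasheet gives a first hint whether blending can be the bottleneck at all.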
Stencil buffering has nothing to do with texture units. User-defined clip planes and polygon stippling will eat a texture unit, but stenciling comes almost for free on TNT2 and GF class hardware.
I don't think he is in software mode either. With a Celeron 433 he would *never* get 20 FPS with 6 stenciled, alpha-blended reflections using M$ software OGL... It would be more like 2 FPS or less.
Whoever you are, Mr Anonymous Poster, you're right.
Last year, I benchmarked a (very simple) reflection example on a PII 400 MHz with a Voodoo3. Obviously, the stencil buffer was in software, because the application ran at less than 1 frame per second. With the stencil buffer disabled, the application ran at about 60 fps!
For sure, you would never get 20 fps for 6 reflections if the stencil were in software.
January 23, 2002 01:20 PM
> Whoever you are, Mr Anonymous Poster, you're right.
Hehe, it's me..
If his 3D card doesn't have stencil acceleration, he could still use my cool glDepthRange(0,0) method discussed elsewhere in this forum.
But a TNT2 surely has HW stencil support, I'm pretty sure about that. The bottleneck must be somewhere else. Still, I strongly recommend the OGL performance paper Nitzan pointed out. It is very useful if you target nVidia boards and want to optimize your application to the maximum. E.g., did you know that using glTexGen() can cut your triangle throughput down to 25% or less!? Now I stay away from glTexGen...
Are you clearing the stencil buffer after rendering each reflection?
That might cause a speed loss, though I think it depends on your resolution. 6 stencil clears at resolutions such as 1024x768 aren't very fast.
You can drop most of the clears just by using more bits for the stencil buffer. Then, when rendering a reflecting object to the stencil buffer, use GL_REPLACE as your stencil operation and a different stencil value for each reflecting object.
IN SHORT
For each reflecting object:
1. Set the stencil operation to GL_REPLACE
2. Choose a unique value to be put in the stencil buffer
3. Render the reflecting object to set the stencil (stencil function GL_ALWAYS)
4. Render the mirrored geometry using stencil function GL_EQUAL
5. Render the reflecting object into the color buffer (optional)
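The steps above can be sketched as a small CPU-side simulation. This is not real OpenGL code: the 16-pixel "framebuffer", the struct and all function names are made up for illustration, and the GL calls named in the comments are only where I would expect each step to map. It just shows why one clear per frame is enough when every mirror gets its own stencil value.

```c
#include <assert.h>
#include <string.h>

#define FB_PIXELS 16

/* Toy one-dimensional framebuffer (hypothetical, for illustration). */
typedef struct {
    unsigned char stencil[FB_PIXELS]; /* 8-bit stencil, 0 = no mirror */
    int color[FB_PIXELS];             /* stand-in for the color buffer */
} Framebuffer;

/* A single clear per frame instead of one clear per mirror. */
void fb_clear(Framebuffer *fb)
{
    memset(fb, 0, sizeof *fb);
}

/* Steps 1-3: draw the mirror over pixels [x0, x1) with stencil
   function ALWAYS and operation REPLACE. In OpenGL this would be
   something like glStencilFunc(GL_ALWAYS, id, 0xFF) and
   glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE). */
void mark_mirror(Framebuffer *fb, int x0, int x1, unsigned char id)
{
    assert(0 <= x0 && x0 <= x1 && x1 <= FB_PIXELS);
    for (int x = x0; x < x1; ++x)
        fb->stencil[x] = id;
}

/* Step 4: draw mirrored geometry over [x0, x1) with stencil function
   EQUAL, so it only lands where this mirror's id was written. In
   OpenGL: glStencilFunc(GL_EQUAL, id, 0xFF) with all ops GL_KEEP. */
void draw_reflection(Framebuffer *fb, int x0, int x1,
                     unsigned char id, int color)
{
    assert(0 <= x0 && x0 <= x1 && x1 <= FB_PIXELS);
    for (int x = x0; x < x1; ++x)
        if (fb->stencil[x] == id)
            fb->color[x] = color;
}

/* Two mirrors, no clear in between; returns the color at pixel x. */
int two_mirror_demo_pixel(int x)
{
    Framebuffer fb;
    assert(0 <= x && x < FB_PIXELS);
    fb_clear(&fb);
    mark_mirror(&fb, 2, 6, 1);                  /* mirror 1, id 1 */
    draw_reflection(&fb, 0, FB_PIXELS, 1, 100); /* its reflection */
    mark_mirror(&fb, 8, 12, 2);                 /* mirror 2, id 2 */
    draw_reflection(&fb, 0, FB_PIXELS, 2, 200); /* its reflection */
    return fb.color[x];
}
```

Pixels 2 to 5 end up with color 100 and pixels 8 to 11 with color 200, even though both reflections were drawn across the whole buffer and the stencil was never cleared between the two mirrors. With an 8-bit stencil that leaves room for up to 255 mirror ids per clear.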
Hope this helps.
--BerLan
> Hehe, it's me..
LOL !!!
You should create an account on gamedev. "It's fast, it's free", and besides, it would let me recognize you more easily.
The depth range thingy might work. That's true.
There's a little extra work to do, but it may run very fast on graphics cards that do not support hardware stenciling.
Though, as the TNT2 seems to have hardware stenciling, there's no _need_ to use this depth range thingy, right?
25% on glTexGen?! I can't believe it. My teachers used to tell me to use that function to speed up texturing!!
You must be speaking of a particular case (like switching texture equations between every triangle). Otherwise it's not logical.
January 23, 2002 02:04 PM
quote:
Are you clearing the stencil-buffer after rendering each reflection?
That might cause a speed loss, though I think it depends on your resolution. 6 stencil clears at resolutions such as 1024x768 aren't very fast.
That's true, especially on cards that use packed-pixel framebuffers (24-bit z-buffer interleaved with 8-bit stencil). nVidia boards do that. Clears are very fast if you clear both the z-buffer and the stencil buffer (well-aligned access on 32-bit boundaries, straight 0 fill), but slow if you clear the stencil buffer alone.
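A minimal sketch of why that could be, modelled on the CPU. The bit layout (depth in the upper 24 bits, stencil in the low 8) and the function names are assumptions for illustration only; what the hardware really does is not documented here. Clearing both buffers is a plain 32-bit store per pixel, as with glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT), while clearing the stencil alone has to keep the depth bits and so becomes a read-modify-write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: [ 24-bit depth | 8-bit stencil ] in one 32-bit word. */
#define STENCIL_MASK 0x000000FFu

/* Clear depth and stencil together: one straight 32-bit store per pixel. */
void clear_depth_and_stencil(uint32_t *buf, size_t n,
                             uint32_t depth24, uint8_t stencil)
{
    uint32_t word = ((depth24 & 0x00FFFFFFu) << 8) | stencil;
    for (size_t i = 0; i < n; ++i)
        buf[i] = word;
}

/* Clear stencil only: each word must be read, masked and written back
   so that the depth bits survive. */
void clear_stencil_only(uint32_t *buf, size_t n, uint8_t stencil)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] = (buf[i] & ~STENCIL_MASK) | stencil;
}

/* Small self-check: dirty one stencil value, clear stencil only,
   and return the resulting packed word. */
uint32_t stencil_only_demo(void)
{
    uint32_t buf[4];
    clear_depth_and_stencil(buf, 4, 0x00FFFFFFu, 0);
    buf[2] |= 0x05u;               /* pretend a mirror wrote id 5 */
    clear_stencil_only(buf, 4, 0);
    return buf[2];
}
```

So if you need to clear the stencil once per frame anyway, doing it in the same glClear call as the depth buffer is the cheap path on such a layout.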
quote:
25% on glTexGen?! I can't believe it. My teachers used to tell me to use that function to speed up texturing!!
You must be speaking of a particular case (like switching texture equations between every triangle). Otherwise it's not logical.
No, it's the general case! I couldn't believe it either. Might be better on GeForce3, though. Have a look at http://developer.nvidia.com/view.asp?IO=ogl_performance_faq
Section 3 (Texture Coordinate Generation), point 19: On a GF2, you lose around 50% throughput when enabling GL_OBJECT_LINEAR, around 62% with GL_EYE_LINEAR, and over 85% with GL_SPHERE_MAP! Better to supply the UV coords yourself and fiddle around with the texture matrix. Though there is also a performance hit if you use a non-identity texture matrix: around 38% loss.
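For the GL_OBJECT_LINEAR case, supplying the coordinates yourself could look roughly like the sketch below. GL_OBJECT_LINEAR computes each coordinate as the dot product of the object-space vertex with a plane (p1*x + p2*y + p3*z + p4*w), so the same values can be precomputed once on the CPU. The type and function names are made up for illustration, and whether this is actually faster on a given card is something you would have to measure.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;

/* Same formula GL_OBJECT_LINEAR uses for one texture coordinate:
   dot product of the plane coefficients and the object-space vertex. */
float object_linear_coord(Vec4 plane, Vec4 vertex)
{
    return plane.x * vertex.x + plane.y * vertex.y +
           plane.z * vertex.z + plane.w * vertex.w;
}

/* Precompute interleaved (s, t) pairs for n vertices. The uv array
   must hold 2 * n floats; it could then be handed to OpenGL as an
   ordinary texture coordinate array instead of enabling glTexGen. */
void generate_uv(const Vec4 *verts, int n,
                 Vec4 s_plane, Vec4 t_plane, float *uv)
{
    assert(verts && uv && n >= 0);
    for (int i = 0; i < n; ++i) {
        uv[2 * i]     = object_linear_coord(s_plane, verts[i]);
        uv[2 * i + 1] = object_linear_coord(t_plane, verts[i]);
    }
}

/* Tiny fixed example: returns element `index` of the generated array. */
float uv_demo(int index)
{
    const Vec4 verts[2] = { {0.0f, 0.0f, 0.0f, 1.0f},
                            {2.0f, 0.0f, 4.0f, 1.0f} };
    const Vec4 s_plane = {0.5f, 0.0f, 0.0f, 0.0f};
    const Vec4 t_plane = {0.0f, 0.0f, 0.25f, 0.5f};
    float uv[4];
    assert(0 <= index && index < 4);
    generate_uv(verts, 2, s_plane, t_plane, uv);
    return uv[index];
}
```

For static geometry this only has to run once at load time, which is the main appeal over letting the driver redo it for every vertex, every frame.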
This topic is closed to new replies.