
Performance problem with reflections

Started by January 21, 2002 01:57 PM
39 comments, last by Kirkbag 23 years ago
quote:
Original post by Anonymous Poster
No, it's the general case! I couldn't believe it either. Might be better on a GeForce3, though. Have a look at http://developer.nvidia.com/view.asp?IO=ogl_performance_faq
Section 3 (Texture Coordinate Generation), point 19: on a GF2, you lose around 50% throughput when activating GL_OBJECT_LINEAR, around 62% with GL_EYE_LINEAR, and over 85% with GL_SPHERE_MAP! Better to supply the UV coords yourself and fiddle around with the texture matrix, though there is also a performance hit if you use a non-identity texture matrix: around 38% loss.
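For illustration, a minimal sketch of that advice: compute the sphere-map UVs on the CPU and submit them explicitly instead of enabling GL_SPHERE_MAP texgen. The helpers sphere_map_uv() and draw_reflective_vertex() are hypothetical names, the refl parameter is assumed to already be the eye-space reflection vector, and immediate mode is used for brevity:

#include <GL/gl.h>
#include <math.h>

/* Standard sphere-map formula: derive (u,v) from the eye-space
   reflection vector r -- the same mapping GL_SPHERE_MAP computes. */
static void sphere_map_uv(const float r[3], float *u, float *v)
{
    float m = 2.0f * (float)sqrt(r[0]*r[0] + r[1]*r[1]
                                 + (r[2] + 1.0f)*(r[2] + 1.0f));
    *u = r[0]/m + 0.5f;
    *v = r[1]/m + 0.5f;
}

void draw_reflective_vertex(const float pos[3], const float refl[3])
{
    float u, v;
    sphere_map_uv(refl, &u, &v);
    glTexCoord2f(u, v);   /* explicit UVs: no texgen throughput hit */
    glVertex3fv(pos);
}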



Hehe, vincoof and the anonymous poster. It doesn't matter where I see you, I always learn from you :)
Anyway, it's true: if the stencil buffer is done in software, it runs at around 0.1 fps. Just see my apocalypse demo in 16 bits/colour. Actually, I've recently learnt that it is not important to switch to 32 bits/colour; the main thing is that the depth buffer plus stencil buffer should total 32 bits/pixel. So 16 bits/colour with a 24-bit depth buffer and an 8-bit stencil buffer is also done in hardware...
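For what it's worth, here is a minimal Win32 sketch of that setup, assuming hdc is a valid device context for the OpenGL window: request 16-bit colour but a 24-bit depth buffer plus 8 stencil bits, so the depth/stencil surface packs into 32 bits per pixel.

#include <windows.h>

int choose_16bit_format_with_stencil(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd = {0};
    int fmt;

    pfd.nSize        = sizeof(pfd);
    pfd.nVersion     = 1;
    pfd.dwFlags      = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL
                     | PFD_DOUBLEBUFFER;
    pfd.iPixelType   = PFD_TYPE_RGBA;
    pfd.cColorBits   = 16;  /* 16-bit colour...                    */
    pfd.cDepthBits   = 24;  /* ...but a 24-bit depth buffer...     */
    pfd.cStencilBits = 8;   /* ...plus 8 stencil bits = 32bpp pack */

    fmt = ChoosePixelFormat(hdc, &pfd);
    if (fmt)
        SetPixelFormat(hdc, fmt, &pfd);
    return fmt;  /* 0 on failure */
}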




-- tSG --
You're welcome.

To my mind, 32-bit colour is really a great step.
Unless your screen displays 16bpp, you'll see the difference.
Look at this picture:
http://qvraytk.free.fr/qvr/Images/source2.jpg
Try switching your screen depth between 16bpp and 32bpp (don't forget to restart the browser every time).
Hmm, I think I really need to register to get my credits around here...
quote:
Original post by Anonymous Poster
I don't think that he is in software mode either. With a Celeron 433 he would *never* get 20 FPS with 6 stenciled, alpha-blended reflections using the M$ software OpenGL... It would be more like 2 FPS or less.



I'm not talking about software OpenGL, I said a software stencil buffer. nVidia cards only have a hardware stencil buffer when they are running in 32-bit colour. If they run in 16-bit colour and require stencil, the driver uses a software stencil buffer, NOT software OpenGL. That explains perfectly why his fps isn't as low as software OpenGL, nor as high as it would be with hardware stencil.
-----------------------"When I have a problem on an Nvidia, I assume that it is my fault. With anyone else's drivers, I assume it is their fault" - John Carmack
quote:

I'm not talking about software OpenGL, I said a software stencil buffer. nVidia cards only have a hardware stencil buffer when they are running in 32-bit colour. If they run in 16-bit colour and require stencil, the driver uses a software stencil buffer, NOT software OpenGL. That explains perfectly why his fps isn't as low as software OpenGL, nor as high as it would be with hardware stencil.



Doesn't work that way. It's either SW or HW, no mix at fragment level. You can't just plug a software stencil buffer into the hardware pixel pipeline. You can have HW/SW mixes before vertex processing (e.g. SW vertex shaders on a GF2), but that's not possible at fragment level: how would a SW stencil buffer tell the hardware not to draw certain pixels based on a stencil compare? If you don't have HW stenciling, it will revert to software mode, *full* software mode.
Well, look at it this way: I can do 20 fps with a software stencil buffer yet still have 8x anisotropic filtering and 4x FSAA enabled running Quake2 at 1024x768. If I run in 32-bit colour I get a flat 83 fps which doesn't budge, due to having an fps cap set. If I run through software OpenGL, I don't even get close to 20 fps.

It's common knowledge that nVidia cards use a software stencil buffer when they are in 16-bit colour mode. If you wish to dictate how it all works, fine by me, just don't be too shocked when you find out that what I said is true and you were ignorant of the fact.

edit: Just to clarify, it's a modified version of Quake2, using the stencil buffer to clean up planar shadows (and in the shadow volume code I'm working on in it).
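For the curious, this is roughly the kind of stencil setup involved; a hedged sketch, with draw_floor() and draw_projected_shadow() as hypothetical helpers and the stencil buffer assumed cleared to 0:

#include <GL/gl.h>

extern void draw_floor(void);             /* hypothetical scene helpers */
extern void draw_projected_shadow(void);

void render_planar_shadow(void)
{
    /* 1. Tag floor pixels with stencil value 1. */
    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_ALWAYS, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
    draw_floor();

    /* 2. Blend the projected shadow only where stencil == 1, and
       increment on pass so overlapping shadow triangles can't
       double-darken the same pixel. */
    glStencilFunc(GL_EQUAL, 1, 0xFF);
    glStencilOp(GL_KEEP, GL_KEEP, GL_INCR);
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glDisable(GL_DEPTH_TEST);   /* shadow polys are coplanar with floor */
    draw_projected_shadow();    /* dark, alpha-blended geometry */
    glEnable(GL_DEPTH_TEST);
    glDisable(GL_BLEND);
    glDisable(GL_STENCIL_TEST);
}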

Edited by - Maximus on January 24, 2002 9:13:18 AM
-----------------------"When I have a problem on an Nvidia, I assume that it is my fault. With anyone else's drivers, I assume it is their fault" - John Carmack
My understanding of the situation is that the graphics card driver has to support the full OpenGL specification, but the hardware does not have to accelerate every function. The driver either calls the hardware or runs the software routines required for a particular OpenGL call. If I remember right, this is from the OpenGL white book (OpenGL Programming for Windows 95 by Ron Fosner).

As a slight aside, I was doing some CSG code a while back using the Goldfeather algorithm, which makes intense use of the stencil buffer. Running this on a TNT2 Ultra in 16bpp gave about 1 fps, but in 32bpp it was about 10 fps.

Querying the current vendor for the set screen mode returned the nVidia OpenGL driver and not the Microsoft OpenGL driver, so it was being run by the hardware driver from nVidia. This clearly shows that the 16bpp stencil buffer is either not optimized or not in hardware.
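The check itself is just a glGetString() call with a current context; note it only reports which driver is loaded, not whether a given pixelformat is actually accelerated:

#include <GL/gl.h>
#include <stdio.h>

void print_gl_driver_info(void)
{
    /* Requires a current OpenGL rendering context. */
    printf("GL_VENDOR:   %s\n", (const char *)glGetString(GL_VENDOR));
    printf("GL_RENDERER: %s\n", (const char *)glGetString(GL_RENDERER));
    printf("GL_VERSION:  %s\n", (const char *)glGetString(GL_VERSION));
}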
Look, Maximus, you should perhaps go to nVidia's page and have a look at some technical papers they have about their hardware. Perhaps this will give you a bit more insight into the internal working of a hardware OpenGL pipeline.

a) You *CANNOT* plug a software stencil buffer into a hardware pixel pipeline. It's physically impossible; you would have to open the GPU and connect your CPU to the internal pipeline for that... There are various tricks to *simulate* a stencil buffer, such as using the alpha test or the mentioned z-buffer mask. But if you ask for an unsupported stencil buffer format in the pixelformat, then Windows will return a *non*-accelerated one (see the sketch after this list). Once a vertex/face has left the driver for the 3D card, the CPU loses control of it; the GPU takes over. The next time the CPU sees it, it is rendered as a pixel in the framebuffer. It cannot interfere with the GPU in any other way (except with vertex/fragment shaders, but that's something else, and they also run on the GPU).

b) '8x anisotropic': nVidia cards have a maximum anisotropy level of 2.
c) nVidia cards always use an 8-bit stencil buffer along with a 24-bit z-buffer as a 32-bit packed pixel format, regardless of your graphics mode (16 or 32 bit). It's hardwired on the GPU.
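As a sketch of how to verify point a) on Windows (hdc assumed to be the window's device context): after setting the pixel format, ask whether it came from the generic, non-accelerated implementation.

#include <windows.h>

int pixel_format_is_accelerated(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    int fmt = GetPixelFormat(hdc);
    DescribePixelFormat(hdc, fmt, sizeof(pfd), &pfd);

    /* Generic format without acceleration => Microsoft software
       OpenGL; generic + PFD_GENERIC_ACCELERATED => MCD; neither
       flag => the full hardware ICD handles this format. */
    if (pfd.dwFlags & PFD_GENERIC_FORMAT)
        return (pfd.dwFlags & PFD_GENERIC_ACCELERATED) ? 1 : 0;
    return 1;
}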

And also as a side note: the standard Q2 does not use the stencil buffer.

My advice: have a look at nVidia's documents.
quote:

My understanding of the situation is that the graphics card driver has to support the full OpenGL specification, but the hardware does not have to accelerate every function. The driver either calls the hardware or runs the software routines required for a particular OpenGL call.


That's correct. If you activate stenciling and your HW doesn't support it, then the driver will use a software renderer to get your triangle on the screen. Same as with, e.g., the accumulation buffer.

quote:

As a slight aside, I was doing some CSG code a while back using the Goldfeather algorithm, which makes intense use of the stencil buffer. Running this on a TNT2 Ultra in 16bpp gave about 1 fps, but in 32bpp it was about 10 fps.


Could be that the TNT2 didn't support the 16-bit RGB + stencil combination and reverted to software mode. The GeForce supports it.
quote:

Querying the current vendor for the set screen mode returned the nVidia OpenGL driver and not the Microsoft OpenGL driver, so it was being run by the hardware driver from nVidia.


Of course. It is this exact hardware driver that will call the software renderer if a particular function is not hardware supported. Querying GL_VENDOR will *always* give you the hardware info, since it's independent of the current pixelformat or features.


Thanks, but my point with the last bit about the vendor string was to prove that I was correctly using the nVidia OpenGL driver and not the Microsoft software drivers. By highlighting that the CSG was very slow in 16bpp on TNT2 hardware, this shows that the speed decrease is due to the nVidia driver doing the stencil in software. The point being that this is the same hardware the original poster is actually using. The GeForce does indeed hardware accelerate a 16bpp stencil buffer, as you say.

This topic is closed to new replies.
