Performance Question
I am using an Nvidia GeForce 256 DDR. Nvidia's white paper claims my card can do 15,000,000 triangles/second, but the programs I write seem stuck at around 320,000 triangles/second. I have a friend who works for Nvidia (on the Direct3D side, though) and he claims that I should be able to get pretty close to 15,000,000 triangles/second. Has anyone been able to achieve this? These are supposed to be 15,000,000 textured and lit triangles.
Nitzan
Well, I've already reached 19M tri/sec (textured, not lit) on my GF2. Granted, it was on a pretty well optimized model and the application wasn't doing anything except rendering (no advanced visibility system or anything else).
15M tri/sec textured and lit seems pretty optimistic in a real-world application, but I think it's possible to reach 7-8M without too much trouble. You must make sure not to be fillrate or CPU limited, and to use the VAR extension (needed to reach high triangle rates on the GeForce series). Hope that helps.
Y.
VAR refers to vertex arrays. Strictly speaking, I believe it is supposed to stand for the NV_vertex_array_range extension, but people sloppily lump things together and refer to regular vertex arrays and compiled vertex arrays by the same acronym.
And yes, that should help you pump more triangles through the pipeline. Read up on the OpenGL performance FAQ provided by Nvidia at www.nvidia.com/developer.
The number they quote for throughput is probably with everything turned off except for lighting and texturing. So turn off alpha, depth, etc., and you probably only want to enable GL_LIGHT0.
If you're using glVertex* you'll probably be application limited, meaning you're not able to send geometry through the pipeline fast enough.
If you have a whole bunch of lights on and are doing other tests throughout the pipeline, you'll be transform limited.
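For example, a stripped-down draw along those lines might look roughly like this (just a sketch; the array names are made up, and you'd fill them in from your own data):

#include <GL/gl.h>

/* Sketch: minimal state for measuring raw triangle throughput.
   verts/norms/texcoords/indices are placeholder arrays filled in elsewhere. */
void draw_throughput_test(const GLfloat *verts, const GLfloat *norms,
                          const GLfloat *texcoords, const GLushort *indices,
                          GLsizei num_indices)
{
    /* turn off everything except one light and texturing */
    glDisable(GL_BLEND);
    glDisable(GL_ALPHA_TEST);
    glDisable(GL_DEPTH_TEST);
    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);
    glEnable(GL_TEXTURE_2D);

    /* vertex arrays instead of per-vertex glVertex3f/glNormal3f calls,
       so the application isn't the thing limiting throughput */
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_NORMAL_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
    glNormalPointer(GL_FLOAT, 0, norms);
    glTexCoordPointer(2, GL_FLOAT, 0, texcoords);

    glDrawElements(GL_TRIANGLES, num_indices, GL_UNSIGNED_SHORT, indices);
}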
SL
Well, if you've got a bunch of lights, I'd rather say you're light limited.
As for acronyms, there is a little confusion.
VA stands for Vertex Arrays. These are the usual vertex arrays in the OpenGL 1.1 core.
CVA stands for Compiled Vertex Arrays. This is an extension available on many recent, and not-so-recent, video cards, meant to avoid retransforming the same vertex array over multiple passes, or when a vertex is reused in many triangles. It is now mostly useless with T&L (Transform & Lighting, again acronyms... grrr) cards.
VAR stands for Vertex Array Range. This is an Nvidia-specific extension that lets you store your geometry in video or AGP memory, reducing the need for the driver to duplicate the data and send it over the bus to the video card. With this extension you're able to reach very high triangle rates, because you're no longer limited by the overhead in the driver. A rough usage sketch follows below.
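Very roughly, setting VAR up looks something like this (a sketch only; the buffer size and priority values are placeholders, and the NV entry points are assumed to have already been fetched with wglGetProcAddress):

#include <windows.h>
#include <GL/gl.h>
#include <GL/glext.h>    /* tokens such as GL_VERTEX_ARRAY_RANGE_NV */

/* Sketch: putting vertices in AGP/video memory with NV_vertex_array_range.
   wglAllocateMemoryNV and glVertexArrayRangeNV are assumed to be function
   pointers obtained elsewhere via wglGetProcAddress. */
#define VAR_BYTES (65536 * 3 * sizeof(GLfloat))   /* example size only */

static GLfloat *var_mem = NULL;

void init_var(void)
{
    /* read freq 0, write freq 0; a priority around 0.5 tends to give AGP
       memory, closer to 1.0 video memory (driver-dependent) */
    var_mem = (GLfloat *) wglAllocateMemoryNV(VAR_BYTES, 0.0f, 0.0f, 0.5f);

    glVertexArrayRangeNV(VAR_BYTES, var_mem);
    glEnableClientState(GL_VERTEX_ARRAY_RANGE_NV);

    /* build or copy the geometry directly into var_mem, then point the
       ordinary vertex array at it and draw with glDrawElements as usual */
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, var_mem);
}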
Hope it's less confusing now.
By the way, the number I gave was with textured, colored vertices and depth test enabled. No blending, no alpha test.
Y.

One of the big issues with reaching maximum performance on your video card is the speed of your processor. While most of the operations are done on the video card, the processor makes a big difference in some respects, and unless you have a fast enough processor to go with a fast video card you will be very limited.
I know this from experience with my SDR Radeon on a PII 450. Even the box for my video card says in the requirements "Pentium III, Athlon, or compatible system", etc.
Wow, everyone always ignores you, don't they...
Please note a little detail:
The speed in triangles per second that Nvidia gives for its cards is the maximum *transform rate*, not including filling. That means it is actually measured with zero-area triangles. You will never reach this performance in real-world applications.
I have a friend who works at Nvidia and he claims they have some very nice demos running at 15,000,000 triangles/second on GeForce 256 cards. He said that my OpenGL program should be able to run at very close to 15,000,000 triangles/second without too much trouble. Unfortunately he works on Direct3D drivers, so he couldn't give me any first-hand help with my game.
-------------------------
www.geocities.com/nitzanw
www.scorchedearth3d.net
-------------------------
GeForce and later cards have pretty high fillrate. It is likely you'll hit the transform limit first if you don't have high overdraw and don't use alpha blending. As I said, I have already reached 19M tri/sec, and that was including fill.
Y.
So I did some research and tried some stuff out. It turns out I am screwed because my terrain is deformable. To get 15,000,000 triangles/second I would need to use the Nvidia-specific calls and allocate memory for the vertices on the Nvidia card (wglAllocateMemoryNV and vertex_array_range).
Normal vertex arrays using glDrawElements (the fastest path) are just a little bit faster than the normal drawing routines (glVertex3f, glColor3f, etc.).
The only way to make vertex arrays fast is to lock them, and even then it's slower than display lists. Not to mention that just the color part of my terrain drops my performance from 750,000 triangles/second to 450,000 triangles/second. Luckily it appears that texture mapping and color are pipelined, so together the performance stays the same.
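For reference, the "locking" here means the EXT_compiled_vertex_array pair glLockArraysEXT/glUnlockArraysEXT. A rough sketch, with made-up terrain array names (the extension entry points are assumed to have been fetched with wglGetProcAddress):

#include <GL/gl.h>
#include <GL/glext.h>   /* EXT_compiled_vertex_array tokens and prototypes */

/* Sketch: one locked draw with compiled vertex arrays. The terrain_* names
   and counts are placeholders for this example. */
void draw_terrain_locked(const GLfloat *terrain_verts,
                         const GLubyte *terrain_colors,
                         const GLushort *terrain_indices,
                         GLsizei num_verts, GLsizei num_indices)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, terrain_verts);
    glColorPointer(3, GL_UNSIGNED_BYTE, 0, terrain_colors);

    glLockArraysEXT(0, num_verts);   /* promise: vertices won't change until unlock */
    glDrawElements(GL_TRIANGLES, num_indices, GL_UNSIGNED_SHORT, terrain_indices);
    /* a second pass over the same vertices could go here and reuse the
       already-transformed data */
    glUnlockArraysEXT();

    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
}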
Well, that's it. I got all my info from http://developer.nvidia.com/view.asp?IO=ogl_performance_faq if any of you are interested in reading it.
-------------------------
www.geocities.com/nitzanw
www.scorchedearth3d.net
-------------------------