Cost of gl functions
Heya, I am looking for some info on the execution costs of various GL functions.
By this I mean which functions, on average, take longer to execute than others, so I can pinpoint any areas causing slowdowns and so on.
Also, if a GL driver does not support a particular command, will OpenGL simply fail to initialise, or will the unsupported commands just be run in software mode? (Thinking of the 3dfx OpenGL drivers here.)
Thnx in advance, Cel
Cel aka Raze
http://chopper2k.qgl.org
September 19, 2000 08:10 AM
Gday,
1. I'm not sure about the exact costs, but keeping calls to the minimum needed is probably best (i.e. don't call glBegin(GL_WHATEVER)/glEnd() more often than necessary, draw polygons sorted by texture to avoid unnecessary binding and unbinding of textures, etc.). Also, keep in mind that most GL functions tax the video card, not the CPU.
2. GL will not emulate any functions in software (unless you're using a software driver). DirectX does do this, but it would probably be better if it just ignored them the way OpenGL does: the emulation is VERY slow and takes a huge, unnecessary hit on performance rather than simply not displaying that element.
Hope this helps
Cel, OpenGL requires that if your video card supports OpenGL, it MUST support all the features of the version it is running.
This simply means that the driver writer needs to emulate the GL functions that the card doesn't support. If you compare OpenGL and Direct3D: in Direct3D the guys at Microsoft do all the software implementation, so the driver writer only needs to implement the things that are in hardware. This is not the case for OpenGL.
As for the version: on Windows right now, OpenGL is at 1.1, so every feature of OpenGL 1.1 needs to be functional with the card.
So to sum up: if a card doesn't implement a feature in hardware, OpenGL itself won't emulate it, but the video driver has to.
September 19, 2000 01:23 PM
The overhead of function calls is not always an issue on every system, but there are optimisation techniques you should adopt regardless of whether it happens to run well on machine X:
Use as few glBegin/glEnd pairs as possible.
Use display lists for static objects.
Use vertex lists to cut down on unnecessary, redundant vertices; this eliminates a LOT of function calls on its own.
There are many more ways to optimise your OpenGL code, but with regard to function calls, the three above will cut the number of calls drastically.
But to really get to the meat of your question: other than optimising your code and benchmarking both versions on the same machine, I don't know how you can measure the impact of each call unless you have a super timer of some sort. (Hey, another function call, haha.)
Just for grins, if you think that's a pain, remember the good old days (and, I imagine, still today) of loop unrolling. Talk about optimisation: rather than iterate 1000 times through a loop, people write programs to generate the body of the loop and literally hard-code it into their app.
for (int i = 0; i < 1000; i++)
{
    glPrint(ch[i]);
}
becomes:
glPrint(ch[0]);
glPrint(ch[1]);
glPrint(ch[2]);
glPrint(ch[3]);
glPrint(ch[4]);
glPrint(ch[5]);
glPrint(ch[6]);
glPrint(ch[7]);
glPrint(ch[8]);
glPrint(ch[9]);
....
I've seen this in many commercial graphics applications. It's amazing how much faster the second version executes: it doesn't have to increment i and check whether it's < 1000 on every iteration, it just executes.
If you feel you need that extra speed so badly, then instead of writing out a thousand lines of code by hand, find a tutorial on writing output to files; then, rather than executing the lines inside the loop, have a small program write those lines out to a file. ie...
Or something of the sort, to generate a file called myloop.txt whose lines you can just copy and paste into your .cpp. If you want it properly indented after that, select all your code and press Alt-F8. It's so much easier than typing out the loop by hand.
S.
Kewl, thanks guys.
I was interested because I am using the DSV library (finally got the damn thing to work), and on my GeForce/Athlon 500, .avi playback is super fast.
On a 3dfx Voodoo3/P2-333, however, it can barely maintain 13 frames/sec. Perhaps this is due to the .avi decompression, but could it be the fact that a new texture is generated for each frame of the .avi? Could trying to create 60 new textures a second be bogging down the Voodoo3?
I guess I am trying to figure out whether it's the CPU grunt or the video card causing this bottleneck. I have a strong feeling it's the Voodoo3, though...
Oh well, this is a huge learning experience for me and I still love it.
Cel
The easiest way to differentiate between the CPU bogging down and the video card bogging down (without any fancy code that checks video-memory usage and CPU usage) is to just swap the video cards between your machines.
I know, it sounds wrong to have a GeForce on a P2-333 and a V3 on an Athlon, but hey, it's easy to check.
S.