Intel HD 620, 400 draw calls, 37FPS

Started by April 30, 2018 12:06 PM
6 comments, last by EddieK 6 years, 9 months ago

Hello everyone,

I'm making a game in OpenGL and I'm running into performance problems. I have 100 dynamic, moving turrets, each composed of two meshes (the gun and the body) with a total of 532 triangles per turret, so one turret takes 2 draw calls to render. I have also implemented shadows, so I have to render the scene twice. That adds up to 400 draw calls per frame (100 turrets × 2 meshes × 2 passes). My GPU is an Intel HD 620, and it can only render the game at ~37 frames per second.

BUT if I render 1000 static trees, each composed of 400 triangles, in two draw calls (one for rendering the shadow depth map texture and one for the main scene), I get a constant 60 FPS.

 

So my question is: is this draw call bottleneck possible? I read somewhere on the forums that an 8400GS (or similar) can handle 2000 draw calls per frame at a respectable framerate. Knowing that the Intel HD 620 is more powerful, this makes no sense to me. But I also read that this is a limitation of integrated graphics chips, which in my case has only 64MB of dedicated memory, while the 8400GS has 256MB. I would also like to know how video memory is related to the number of draw calls a GPU can handle per frame.

I would really like to hear your opinions and views on this matter.

Thanks for the replies in advance :)

 

EDIT: I realized that drawing the depth map texture into a framebuffer object takes a big bite out of the game's performance. Still, I would expect better performance than that.

It's not a simple problem. At first thought, I would guess it is actually a CPU bottleneck, not a GPU one (increasing draw calls reduces FPS). But you are actually comparing rendering turrets to rendering trees. If the tree shader is simple or the trees are small on the screen, it could also be a GPU bottleneck (that GPU is very bad with fill rate in my experience). It would be best to compare rendering the turrets in a single draw call vs 400 draw calls.

You could also do some profiling with Intel Graphics Performance Analyzers (GPA).

The trees are actually bigger than the turrets, and they both use exactly the same shader. I will try downloading Intel Graphics Performance Analyzers and see how it goes. By the way, thanks for recommending this tool, I hadn't heard of it before.

Have you tried instancing to reduce the number of draw calls, to test your theory that draw calls are the problem?
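
For what it's worth, here is a rough sketch of the CPU side of instancing, assuming each turret only needs a per-instance model matrix. `Turret`, `translation`, `buildInstanceBuffer` and `drawCalls` are made-up names for illustration (the thread never shows the real code), and the GL calls appear only in comments because they need a live OpenGL 3.3+ context.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical per-turret state; the real game will have more than this.
struct Turret {
    float x, y, z;  // world position
};

// Column-major 4x4 matrix, the layout OpenGL expects for a mat4.
using Mat4 = std::array<float, 16>;

// Translation-only model matrix (rotation is left out to keep the sketch short).
Mat4 translation(float x, float y, float z) {
    return Mat4{1.0f, 0.0f, 0.0f, 0.0f,
                0.0f, 1.0f, 0.0f, 0.0f,
                0.0f, 0.0f, 1.0f, 0.0f,
                x,    y,    z,    1.0f};
}

// Pack one matrix per turret into a contiguous array that can be
// uploaded to a per-instance vertex buffer in a single call.
std::vector<Mat4> buildInstanceBuffer(const std::vector<Turret>& turrets) {
    std::vector<Mat4> out;
    out.reserve(turrets.size());
    for (const Turret& t : turrets) {
        out.push_back(translation(t.x, t.y, t.z));
    }
    return out;
}

// Draw calls per frame with and without instancing.
std::size_t drawCalls(std::size_t turrets, std::size_t meshesPerTurret,
                      std::size_t passes, bool instanced) {
    return (instanced ? 1 : turrets) * meshesPerTurret * passes;
}

// With a GL context, each mesh would then be drawn roughly like this:
//   glBindBuffer(GL_ARRAY_BUFFER, instanceVbo);
//   glBufferData(GL_ARRAY_BUFFER, m.size() * sizeof(Mat4), m.data(), GL_STREAM_DRAW);
//   glVertexAttribDivisor(attribIndex, 1);  // once per mat4 column attribute
//   glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr,
//                           static_cast<GLsizei>(m.size()));
```

With the numbers from the first post, `drawCalls(100, 2, 2, false)` gives 400 and `drawCalls(100, 2, 2, true)` gives 4, so if the frame time barely moves after switching, draw calls were probably not the bottleneck.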

-potential energy is easily made kinetic-

Seconding the use of a graphics analyser if you can get one. Even if one is not available, it's usually pretty easy to track down the bottleneck by doing a series of tests like these:

  • Turn off texture swaps
  • Use small textures
  • Small shadow map
  • Use small screen size (postage stamp)

Check that you are batching by texture / shader. I.e., if the bases and the guns use different textures, render all the bases, then all the guns, rather than chopping and changing for each turret, so that you minimize expensive state changes. Use instancing if available.
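
A minimal sketch of the batching idea, assuming every draw can be described by a shader id and a texture id. `DrawItem`, `sortByState` and `countStateChanges` are hypothetical names, and the ids are plain integers standing in for real GL objects.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// One queued draw; all three ids are illustrative placeholders.
struct DrawItem {
    unsigned shader;
    unsigned texture;
    unsigned mesh;
};

// Reorder the queue so draws sharing a shader, then a texture, are adjacent.
void sortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return std::tie(a.shader, a.texture) <
                                std::tie(b.shader, b.texture);
                     });
}

// Count the shader and texture binds needed to draw the queue in order.
// The first item always costs one bind of each.
std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shader != items[i - 1].shader) ++changes;
        if (i == 0 || items[i].texture != items[i - 1].texture) ++changes;
    }
    return changes;
}
```

For a queue that alternates base, gun, base, gun with one shared shader and two textures, the bind count grows with the number of turrets before sorting and stays at three after it.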

Are the bases actually changing, or are they static? If they are static, a whole lot of options are available: you can build one big buffer of all of them, pre-transformed into world space, and draw it with one call. You can also cull out the backfaces if your viewpoint is fixed.
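
The "one big buffer" option could look roughly like this, assuming the static bases differ only by position (rotation and scale are left out). `Vec3`, `Mesh` and `mergeStatic` are hypothetical names, not anything from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Indexed triangle mesh: positions plus indices into them.
struct Mesh {
    std::vector<Vec3> positions;
    std::vector<std::uint32_t> indices;
};

// Copy `base` once per offset, moving each copy into world space and
// rebasing its indices, so the result can be drawn with a single call.
Mesh mergeStatic(const Mesh& base, const std::vector<Vec3>& offsets) {
    Mesh merged;
    merged.positions.reserve(base.positions.size() * offsets.size());
    merged.indices.reserve(base.indices.size() * offsets.size());
    for (const Vec3& o : offsets) {
        // Index of this copy's first vertex inside the merged buffer.
        const auto first = static_cast<std::uint32_t>(merged.positions.size());
        for (const Vec3& p : base.positions) {
            merged.positions.push_back(Vec3{p.x + o.x, p.y + o.y, p.z + o.z});
        }
        for (std::uint32_t i : base.indices) {
            assert(i < base.positions.size());  // base mesh must be well-formed
            merged.indices.push_back(first + i);
        }
    }
    return merged;
}
```

The merged mesh is built once at load time and uploaded once; the trade-off is extra memory, since every copy stores its own vertices.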

Make sure your shadow depth shader is cheap. If the viewpoint for the shadow map doesn't change, you might be able to prerender the bases into it, or prerender base shadows into the land.

This is very old information, so it may not apply to your situation, but traditionally lots of small draws are slower than a few big draws, so definitely try the instancing mentioned above (or even combine them into one big draw).

https://stackoverflow.com/questions/4853856/why-are-draw-calls-expensive

Nvidia wrote a paper about it:

http://www.nvidia.com/docs/IO/8228/BatchBatchBatch.pdf

 

Thanks for the help, guys. I'm still new to OpenGL and I made a big mistake: instead of reusing the data stored in VBOs, I uploaded the same data for every entity every frame. As soon as I stopped doing that, my performance went from 70ms per frame to 8ms per frame :D
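
For anyone hitting the same problem, here is a small sketch of the "upload once, reuse every frame" pattern. `FakeGpu` is a stand-in that only counts uploads where real code would call `glBufferData`, and `MeshCache` is a hypothetical helper, so treat this as an illustration of the idea rather than the actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Stand-in for the driver: records uploads instead of touching a real GPU.
struct FakeGpu {
    std::size_t uploads = 0;
    std::size_t bytesUploaded = 0;
    unsigned nextHandle = 1;

    // In real code this is where glGenBuffers / glBufferData would go.
    unsigned upload(const std::vector<float>& vertices) {
        assert(!vertices.empty());  // nothing to upload otherwise
        ++uploads;
        bytesUploaded += vertices.size() * sizeof(float);
        return nextHandle++;
    }
};

// Uploads a mesh the first time it is requested, then hands back the
// same buffer handle for every later request.
class MeshCache {
public:
    explicit MeshCache(FakeGpu& gpu) : gpu_(gpu) {}

    unsigned handleFor(int meshId, const std::vector<float>& vertices) {
        auto it = handles_.find(meshId);
        if (it != handles_.end()) {
            return it->second;  // already on the GPU, no upload
        }
        const unsigned handle = gpu_.upload(vertices);
        handles_.emplace(meshId, handle);
        return handle;
    }

private:
    FakeGpu& gpu_;
    std::unordered_map<int, unsigned> handles_;
};
```

With this in place, 100 entities sharing one mesh cause a single upload no matter how many frames are drawn; per frame only the bind and the draw call remain.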

This topic is closed to new replies.