Optimizing pipeline rendering (and draw ordering / sorting)

Started by May 14, 2019 02:24 AM
5 comments, last by JohnnyCode 5 years, 8 months ago

Hi,

I am pretty new to gamedev and am toying around with my first game.

I have my own crude 3D engine I've built using Monogame. I am trying to optimize my rendering pipeline. Here's my scenario:

I have a number of 3D playing cards I want to render. Each model has a front, back and sides (a single model made up of different mesh parts). The back is the same for a number of the models being rendered. So are the sides. The front is mostly different. For my deck, I can see the front of the cards (and not the back), and for the enemy deck, I can only see the back of his / her cards.

The card model is made up of a single set of vertices and indexes with a different start index for the different mesh parts (front, back and sides). The textures are not atlased, which allows me to easily set the front and back textures accordingly. 

To optimize rendering, I could do the following:

1. Render each card separately (which would be made up of 3 draw calls per card, viz. front, back and sides). Since they all share the same vertex buffer, only the textures would differ. This reduces the vertex and index buffer switches.

2. Sort by texture and render the different mesh parts separately viz. all card backs get rendered sequentially, followed by card fronts, sides etc. This will optimize texture switches, but the vertex and index buffers will keep switching since a single card has a single vertex buffer shared by 3 mesh parts viz. front, back and side.

 

I am a bit confused about which is most optimal. I would think that rendering the card in one go would be best since the vertex and index buffers are shared, but all the reading I've done thus far indicates that you should avoid texture switches, so sorting the mesh parts by texture sounds like it's the way to go, but would result in more vertex and index buffer switching.
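To make the trade-off concrete, here is a rough sketch that counts texture binds for both orderings. It is only a model of the draw list, not MonoGame code: the `DrawCall` class, the texture names and the helper functions are all made up for illustration, and it assumes every card has a unique front texture and shares one back and one side texture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCall:
    card: int      # which card this mesh part belongs to
    part: str      # "front", "back" or "side"
    texture: str   # hypothetical name of the texture bound for this part


def build_calls(num_cards):
    """Three mesh parts per card; only the front texture is unique (assumption)."""
    calls = []
    for card in range(num_cards):
        calls.append(DrawCall(card, "front", f"front_{card}"))
        calls.append(DrawCall(card, "back", "back_shared"))
        calls.append(DrawCall(card, "side", "side_shared"))
    return calls


def count_texture_binds(calls):
    """Count how often the bound texture changes, including the first bind."""
    binds = 0
    current = None
    for call in calls:
        if call.texture != current:
            binds += 1
            current = call.texture
    return binds


calls = build_calls(10)
# Option 1: draw each card in one go (front, back, side, next card...).
per_card = count_texture_binds(calls)
# Option 2: group all draws that use the same texture.
by_texture = count_texture_binds(sorted(calls, key=lambda c: c.texture))
print(per_card, by_texture)
```

With 10 cards this gives 30 binds for the per-card order and 12 for the texture-sorted order. If every card really does use the same vertex and index buffer, as described in option 1, then drawing a different mesh part only changes the start index, so the buffer-switch cost in option 2 may be smaller than I fear.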

Any advice would be appreciated!

 

 

Apart from the Jokers, hardly any two cards share a face. Do you have a different preset for how you texture them? What kind of cards are they?

-do the cards overlap each other?

In that case, you should sort by Z depth first to save work in the rasterizer stage, and sort the draw calls by texture only as a secondary key, if at all.
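One common way to express "depth first, texture second" is a packed integer sort key. The sketch below is only an illustration under assumptions: the function name, the `near`/`far` range, the bit widths and the draw names are invented, and it quantizes depth into coarse buckets so that draws at roughly the same depth still get grouped by texture.

```python
def make_sort_key(depth, texture_id, near=0.1, far=100.0,
                  depth_bits=8, texture_bits=16):
    """Pack a quantized depth (high bits) and a texture id (low bits)."""
    # Normalize depth to [0, 1] and clamp it to the assumed near/far range.
    t = (depth - near) / (far - near)
    t = min(max(t, 0.0), 1.0)
    # Coarse depth bucket: nearby draws land in the same bucket.
    bucket = int(t * ((1 << depth_bits) - 1))
    # Texture id only breaks ties between draws in the same bucket.
    return (bucket << texture_bits) | (texture_id & ((1 << texture_bits) - 1))


# (name, view-space depth, texture id) for a few hypothetical draws.
draws = [
    ("far_back", 40.0, 1),
    ("near_front_b", 5.1, 9),
    ("near_back", 5.05, 1),
    ("near_front_a", 5.0, 9),
]

# Ascending key = front to back, which suits opaque geometry.
ordered = sorted(draws, key=lambda d: make_sort_key(d[1], d[2]))
for name, depth, texture_id in ordered:
    print(name, depth, texture_id)
```

Fewer depth bits means larger buckets and more grouping by texture; more depth bits means stricter front-to-back order. Which balance is better depends on the scene, so treat the defaults here as placeholders.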

They are not traditional cards, but custom cards for a trading card game I am working on.

Yes, cards do overlap on screen. 

I am currently also sorting by Z, but it's not the highest priority at the moment (in the sort key). 

If I sort by Z first, then I am essentially rendering each card (front, back and sides) sequentially and maximizing texture switching... Isn't that supposed to be a bad thing?

If you stay under 10 texture switches, you are in perfect shape; under 30, you are still fine. Do not waste rasterizer work by prioritizing texture order in your draw calls unless you actually need to get down to those numbers.

Thanks for the pointers. Along these lines, I have seen that some other engines (e.g. Unity and others) merge / batch static meshes into one large mesh. Since these meshes would typically be at different Z's, isn't that going to impact the rasterizer stage? I considered doing that.

The other option I played around with is instancing the meshes e.g. back of the card and using texture atlasing for the instances (and setting different world coords for each instance). Is this worth the effort? 

On 5/14/2019 at 8:53 PM, Driv said:

that they merge / batch static meshes into one large mesh. Since these meshes would typically be at different Z's, isn't that going to impact the rasterizer stage?

No. The batched vertex buffer holds as much geometry as it can, and because no vertex input switches are needed, the engine can still emit individual indexed draw calls, ordered so that Z rejection keeps working.
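As a rough sketch of that idea (not how Unity or any particular engine implements it), meshes can be appended into one vertex list and one index list while remembering each mesh's index range. Each range can then be drawn on its own, in whatever depth order you like, without changing buffers. The function name and the quad data are made up for illustration.

```python
def merge_meshes(meshes):
    """Merge (vertices, indices) pairs into shared buffers.

    Returns the merged vertex list, the merged index list, and one
    (start_index, index_count) range per input mesh.
    """
    merged_vertices = []
    merged_indices = []
    ranges = []
    for vertices, indices in meshes:
        base_vertex = len(merged_vertices)   # offset of this mesh's vertices
        start_index = len(merged_indices)    # where this mesh's indices begin
        merged_vertices.extend(vertices)
        # Rebase indices so they point into the merged vertex list.
        merged_indices.extend(i + base_vertex for i in indices)
        ranges.append((start_index, len(indices)))
    return merged_vertices, merged_indices, ranges


# Two hypothetical quads at different depths, two triangles each.
quad_indices = [0, 1, 2, 0, 2, 3]
near_quad = ([(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)], quad_indices)
far_quad = ([(0, 0, 9), (1, 0, 9), (1, 1, 9), (0, 1, 9)], quad_indices)

vertices, indices, ranges = merge_meshes([near_quad, far_quad])
print(len(vertices), indices, ranges)
```

Each `(start_index, index_count)` pair corresponds to the kind of start index and primitive count an indexed draw call takes, so the per-mesh draws can be sorted by depth while the buffers stay bound.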

 

On 5/14/2019 at 8:53 PM, Driv said:

The other option I played around with is instancing the meshes e.g. back of the card and using texture atlasing for the instances

If the camera cannot zoom and the scene cannot scale (that is, nothing forces different mip-map levels to be sampled), an atlased texture with minimal margins would be very effective.
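For reference, here is a minimal sketch of computing the UV rectangle of one atlas cell with a small margin, assuming a uniform grid atlas. The function name, the grid layout and the 1-pixel default margin are assumptions for illustration; each instance would be given its own rectangle.

```python
def atlas_uv_rect(index, columns, rows, atlas_size, margin_px=1):
    """Return (u0, v0, u1, v1) for cell `index` of a uniform grid atlas.

    `atlas_size` is (width, height) in pixels. The margin shrinks the
    rectangle inward so sampling does not bleed into neighbouring cells.
    """
    if not 0 <= index < columns * rows:
        raise ValueError("cell index outside the atlas grid")
    width, height = atlas_size
    col = index % columns
    row = index // columns
    cell_w = width / columns
    cell_h = height / rows
    u0 = (col * cell_w + margin_px) / width
    v0 = (row * cell_h + margin_px) / height
    u1 = ((col + 1) * cell_w - margin_px) / width
    v1 = ((row + 1) * cell_h - margin_px) / height
    return (u0, v0, u1, v1)


# Cell 5 of a hypothetical 4x4 atlas that is 1024x1024 pixels.
rect = atlas_uv_rect(5, columns=4, rows=4, atlas_size=(1024, 1024))
print(rect)
```

Once mip-mapping comes into play, lower mip levels blend neighbouring cells together, so a margin that is fine at full resolution may no longer be enough. That is why the fixed-zoom condition above matters.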

But the heaviest operation is sampling the actual texture in the pixel shader stage, not its memory upload during a switch. I have used individual 128x128 mipmapped textures with 150 switches on an Intel HD GPU without bottlenecking the actual screen frame rate.

This topic is closed to new replies.
