Hello,
Hodgman has talked extensively about stateless rendering. It is definitely the way to go, but I’m still not happy with my implementation and need some pointers. (Note: I am trying to support both DX11 and DX12.)
http://www.goatientertainment.com/downloads/Designing%20a%20Modern%20GPU%20Interface%20NOTES.pdf
State Cache and Redundancy:
My current solution is to generate a PSO from various abstract descriptors (basically the DX11/12 BlendDesc etc.). When CreatePiplineState() is called, I hash each descriptor and, if it’s unique, store the state in a hash map. A pointer to that state is stored in the pipeline state object, which itself lives in a (non-hashed) pool. The function returns a PiplineStateHandle used to access the pipeline object when setting it. When drawing, you fetch the PSO and check each state to see if it’s already set; if not, you set it and record it as the currently set state. Also, I currently cannot delete hash-mapped states.
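To make the hash-and-deduplicate step concrete, here is a minimal sketch of that kind of cache for a single state type. Everything in it is an assumption for illustration: `BlendDesc` is a hypothetical API-agnostic stand-in rather than the real D3D11/D3D12 structure, `BlendStateCache` and `getOrCreate` are made-up names, and a plain 8-bit ID stands in for the handle. It is not necessarily how my code or Hodgman's is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, API-agnostic blend descriptor (stand-in for the real DX11/12 desc).
struct BlendDesc {
    std::uint8_t enable;
    std::uint8_t srcFactor;
    std::uint8_t dstFactor;
    std::uint8_t op;
};

inline bool operator==(const BlendDesc& a, const BlendDesc& b) {
    return a.enable == b.enable && a.srcFactor == b.srcFactor &&
           a.dstFactor == b.dstFactor && a.op == b.op;
}

// FNV-1a over the descriptor fields. Hashing field by field avoids padding bytes.
struct BlendDescHash {
    std::size_t operator()(const BlendDesc& d) const noexcept {
        const std::uint8_t bytes[] = {d.enable, d.srcFactor, d.dstFactor, d.op};
        std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
        for (std::uint8_t b : bytes) {
            h ^= b;
            h *= 1099511628211ull;                  // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

using BlendStateId = std::uint8_t;

class BlendStateCache {
public:
    // Returns the ID of an identical existing state, or stores a new one.
    BlendStateId getOrCreate(const BlendDesc& desc) {
        auto it = lookup_.find(desc);
        if (it != lookup_.end()) return it->second;
        assert(states_.size() < 256 && "8-bit ID space exhausted");
        const BlendStateId id = static_cast<BlendStateId>(states_.size());
        states_.push_back(desc);  // a real backend would create the API object here
        lookup_.emplace(desc, id);
        return id;
    }

    const BlendDesc& get(BlendStateId id) const { return states_.at(id); }
    std::size_t size() const { return states_.size(); }

private:
    std::vector<BlendDesc> states_;  // the pool, indexed by ID
    std::unordered_map<BlendDesc, BlendStateId, BlendDescHash> lookup_;
};
```

Because the map is keyed on the descriptor itself rather than on the raw hash value, two different descriptors that happen to hash to the same value still get separate IDs.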
With Hodgman’s solution, you store the IDs of the states in the draw item, and each one takes up a certain number of bits (BlendState == 8 bits etc.). So instead of storing the pointer (ID3D11BlendState* m_pBlendPrevious), you store the previous DrawItem, XOR it against the current one, and check which state fields are non-zero and therefore need to be bound.
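As I understand the XOR idea, it could be sketched roughly as below. The bit layout (three 8-bit fields for blend, rasterizer and depth state) and all names are my own assumptions, extrapolated from the "BlendState == 8 bits" example; the real DrawItem layout may well differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: three 8-bit state IDs packed into one 32-bit word.
const std::uint32_t kBlendShift  = 0;
const std::uint32_t kRasterShift = 8;
const std::uint32_t kDepthShift  = 16;
const std::uint32_t kFieldMask   = 0xFFu;

// One flag per state type that needs re-binding.
const std::uint32_t kDirtyBlend  = 1u << 0;
const std::uint32_t kDirtyRaster = 1u << 1;
const std::uint32_t kDirtyDepth  = 1u << 2;

inline std::uint32_t packStates(std::uint8_t blend, std::uint8_t raster, std::uint8_t depth) {
    return (static_cast<std::uint32_t>(blend)  << kBlendShift) |
           (static_cast<std::uint32_t>(raster) << kRasterShift) |
           (static_cast<std::uint32_t>(depth)  << kDepthShift);
}

// XOR the packed IDs of the previous and current draw items. Any field whose
// XOR result is non-zero holds a different ID and has to be bound again.
inline std::uint32_t changedStates(std::uint32_t previous, std::uint32_t current) {
    const std::uint32_t diff = previous ^ current;
    std::uint32_t dirty = 0;
    if ((diff >> kBlendShift)  & kFieldMask) dirty |= kDirtyBlend;
    if ((diff >> kRasterShift) & kFieldMask) dirty |= kDirtyRaster;
    if ((diff >> kDepthShift)  & kFieldMask) dirty |= kDirtyDepth;
    return dirty;
}
```

If the whole XOR result is zero, the draw can skip state binding entirely with a single comparison, which is the appeal over comparing one pointer per state.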
So, if you use DrawItems, where do PiplineStates come into it? If you store the pipeline state as an ID, you basically collapse all shaders and states into a single value. If you simulate a PSO in DX11, can you store the bit IDs of each state in a PSO object in the graphics device pool and then perform the XOR against that object to resolve redundancies? On DX12, do you just check whether the pipeline object is the currently bound one and leave it at that?
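To show the two options I am asking about side by side, here is a rough sketch that uses a mock device which only counts binds instead of calling any graphics API. `EmulatedPso`, `MockDevice`, `Dx11StyleBinder` and `Dx12StyleBinder` are all hypothetical names, and the four 8-bit fields are an assumed layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical emulated PSO for a DX11-style backend: just the state IDs.
struct EmulatedPso {
    std::uint8_t blendId;
    std::uint8_t rasterId;
    std::uint8_t depthId;
    std::uint8_t shaderId;
};

// Mock device: counts binds rather than issuing real API calls.
struct MockDevice {
    int fineGrainedBinds = 0;  // DX11-style: one call per changed state
    int pipelineBinds = 0;     // DX12-style: one call per changed PSO
};

inline std::uint32_t pack(const EmulatedPso& p) {
    return static_cast<std::uint32_t>(p.blendId) |
           (static_cast<std::uint32_t>(p.rasterId) << 8) |
           (static_cast<std::uint32_t>(p.depthId) << 16) |
           (static_cast<std::uint32_t>(p.shaderId) << 24);
}

class Dx11StyleBinder {
public:
    void bind(MockDevice& dev, const EmulatedPso& pso) {
        const std::uint32_t packed = pack(pso);
        // Nothing is known about the device on the first bind, so every field counts as changed.
        const std::uint32_t diff = hasCurrent_ ? (packed ^ current_) : 0xFFFFFFFFu;
        for (int field = 0; field < 4; ++field) {
            if ((diff >> (field * 8)) & 0xFFu) ++dev.fineGrainedBinds;
        }
        current_ = packed;
        hasCurrent_ = true;
    }

private:
    std::uint32_t current_ = 0;
    bool hasCurrent_ = false;
};

using PsoHandle = std::uint32_t;
const PsoHandle kInvalidPso = 0xFFFFFFFFu;

class Dx12StyleBinder {
public:
    // Only the handle is compared; the individual states are never inspected.
    void bind(MockDevice& dev, PsoHandle handle) {
        if (handle == current_) return;
        ++dev.pipelineBinds;
        current_ = handle;
    }

private:
    PsoHandle current_ = kInvalidPso;
};
```

In this sketch the DX11 path still benefits from per-field filtering when two different PSOs share most of their states, while the DX12 path treats the PSO as opaque.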
Can the Hash Map Be Replaced:
Storing all the states in a hash map doesn’t really feel right. I’m thinking of creating a pool that is split into 2 sections: engine-generated and user-generated states.
So, for instance the first 4 BlendStates will be:
1. Opaque
2. AlphaBlend
3. Additive
4. NonPremultiplied.
The remaining 252 (if 8 bit ID) will be user generated states.
But what about a clash? With a map you can hash the descriptor, find a duplicate and reuse it; with a pool you would have to do a linear scan from 4-256 to find the element with the same hash. You could store a map into the pool, but that means keeping both a pool and a map for each state type.
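For reference, the pool-with-linear-scan variant I have in mind looks roughly like this. It is a sketch under a few assumptions: IDs are 0-based (so the engine states occupy 0-3), only a 64-bit descriptor hash is stored per slot, and the names `BlendStatePool` and `findOrAdd` are made up. Unlike the description above, the scan here also covers the engine slots so that a user request identical to an engine state reuses that slot.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Engine-owned IDs, in the same order as the list above (0-based here).
const int kOpaque = 0;
const int kAlphaBlend = 1;
const int kAdditive = 2;
const int kNonPremultiplied = 3;
const int kEngineStateCount = 4;

class BlendStatePool {
public:
    static constexpr int kCapacity = 256;    // 8-bit ID space
    static constexpr int kInvalidSlot = -1;

    // The engine states are registered once, in ID order.
    explicit BlendStatePool(const std::array<std::uint64_t, 4>& engineHashes) {
        for (int i = 0; i < kEngineStateCount; ++i) hashes_[i] = engineHashes[i];
        count_ = kEngineStateCount;
    }

    // Linear scan over the used slots; appends a new user state on a miss.
    // Only hashes are compared, so a real implementation should also compare
    // the full descriptor to rule out hash collisions.
    int findOrAdd(std::uint64_t descHash) {
        for (int i = 0; i < count_; ++i) {
            if (hashes_[i] == descHash) return i;
        }
        if (count_ == kCapacity) return kInvalidSlot;
        hashes_[count_] = descHash;
        return count_++;
    }

    int size() const { return count_; }

private:
    std::array<std::uint64_t, kCapacity> hashes_{};
    int count_ = 0;
};
```

A full scan reads at most 256 eight-byte hashes (2 KiB), and it only happens at state creation time rather than per draw, so the missing map may matter less than it first appears.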
Would it be better to just hardcode the most common states and leave it at that?
Also, I currently do not delete states because there’s no way of knowing whether someone else is using them. Is it okay to just leave the states in a pool until the engine closes? Chances are they’re going to be reused often when moving between levels and such, and the number of states is relatively low; for instance, the 4 base blend states are probably all you will ever need in most cases.
Sampler State Redundancy and Sharing:
From DX11 onwards you are supposed to store a texture and a sampler separately, allowing for efficient reuse of samplers. How would you handle this, though? For instance, in Unity a Texture is a texture/sampler bundle just like in DX9, but this would limit us to 16 textures. If the shader is written to reuse samplers, though, then when you bind textures their sampler bindings would be ignored and they would share whatever was set in the slot they wanted to use. Does anyone have an elegant solution for this?
Architecture Goal:
I’m using ECS for certain reusable aspects, so for rendering I will have a render component that basically generates and caches the DrawItems for its mesh/submeshes and submits them to the renderer to be sorted into the queues (Opaque, Geometry, Alpha Test, Transparent etc.) and deals with the submission. I’m guessing that’s how most people would do it.
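A stripped-down sketch of the submit-and-sort flow described above might look like the following. The `DrawItem` fields, the `Queue` enum (reduced to three queues) and the sort criteria are assumptions chosen for illustration, not a claim about how Hodgman's renderer orders things.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical queue list, reduced to three entries for brevity.
enum class Queue : std::uint8_t { Opaque, AlphaTest, Transparent, Count };

// Hypothetical cached draw item produced by a render component.
struct DrawItem {
    std::uint32_t stateKey;  // packed state IDs
    float depth;             // distance from the camera
    std::uint32_t meshId;
};

class Renderer {
public:
    void submit(Queue q, const DrawItem& item) {
        queues_[static_cast<std::size_t>(q)].push_back(item);
    }

    // Opaque and alpha-tested items are grouped by state key to reduce state
    // changes; transparent items are ordered back to front for blending.
    void sortQueues() {
        auto byState = [](const DrawItem& a, const DrawItem& b) { return a.stateKey < b.stateKey; };
        auto backToFront = [](const DrawItem& a, const DrawItem& b) { return a.depth > b.depth; };
        sortQueue(Queue::Opaque, byState);
        sortQueue(Queue::AlphaTest, byState);
        sortQueue(Queue::Transparent, backToFront);
    }

    const std::vector<DrawItem>& queue(Queue q) const {
        return queues_[static_cast<std::size_t>(q)];
    }

private:
    template <typename Compare>
    void sortQueue(Queue q, Compare cmp) {
        auto& items = queues_[static_cast<std::size_t>(q)];
        std::stable_sort(items.begin(), items.end(), cmp);
    }

    std::array<std::vector<DrawItem>, static_cast<std::size_t>(Queue::Count)> queues_;
};
```

Sorting the opaque queue by state key places items with identical state next to each other, which is exactly the case where a previous-versus-current redundancy check finds nothing to re-bind.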
Conclusion:
If anyone has insight into anything I mentioned above, it would be great to hear your opinion on the subject, particularly regarding the caching and sampler slot issues, as that’s one of the major refactors I’m attempting in order to make it more like Hodgman’s, with the DrawItem and RenderPass submissions.
Thanks!