Hi,
I have implemented instancing in my D3D11 engine by using constant buffers:
struct CInstance
{
    float4x3 WorldMatrix;
};

cbuffer InstanceBuffer : register(b0)
{
    CInstance Instance[2];
};
The maximum number of instances per draw call is not known at compile time, because it depends on the primitives currently visible on screen (the engine fills the constant buffer dynamically with Map/Unmap). To force indexed access in the HLSL shader, I've declared the array with a minimum size of 2 instances.
So far this has worked very well on multiple computers (AMD, Nvidia and Intel graphics cards), but now I am receiving error reports from some of my customers where objects are missing on screen.
It seems that this "indexing hack" does not work on all graphics card/driver combinations (the DX debug layer does not complain). I thought that indexing always works as long as the underlying buffer is big enough (which it is), but then I found the following statement in the DirectX specs: https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.5%20Constant%20Buffers
"If the constant buffer bound to a slot is larger than the size declared in the shader for that slot, implementations are allowed to return incorrect data (not necessarily 0) for indices that are larger than the declared size but smaller than the buffer size."
So my question is: how can I implement dynamic instancing where the number of instances is determined at runtime? Should I always declare the maximum possible instance count in the shader? That looks like a performance problem to me, since I would be uploading a full 64 KB constant buffer for only 2 or 3 instances.
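To put rough numbers on that concern, here is a small C++ sketch of the sizes involved. It assumes that float4x3 uses HLSL's default column-major packing (3 float4 registers, i.e. 48 bytes per instance) and relies on the D3D11 limit of 4096 float4 registers per constant buffer. The constant and function names are hypothetical and only serve the illustration; they are not taken from my engine.

```cpp
#include <cstddef>

// Assumption: float4x3 with default column-major packing takes
// 3 float4 registers, so one CInstance is 3 * 16 = 48 bytes.
constexpr std::size_t kRegisterBytes        = 16;
constexpr std::size_t kRegistersPerInstance = 3;
constexpr std::size_t kInstanceBytes        = kRegisterBytes * kRegistersPerInstance;

// D3D11 allows at most 4096 float4 registers (64 KB) per constant buffer.
constexpr std::size_t kMaxRegisters   = 4096;
constexpr std::size_t kMaxBufferBytes = kMaxRegisters * kRegisterBytes;

// Hypothetical helper: largest Instance[] array that fits in one buffer.
constexpr std::size_t MaxInstancesPerBuffer()
{
    return kMaxRegisters / kRegistersPerInstance;
}

// Hypothetical helper: bytes that actually carry data for `count` instances.
constexpr std::size_t UsedBytes(std::size_t count)
{
    return count * kInstanceBytes;
}

// Compile-time checks of the arithmetic.
static_assert(kMaxBufferBytes == 65536, "64 KB constant buffer limit");
static_assert(MaxInstancesPerBuffer() == 1365, "4096 registers / 3 per instance");
static_assert(UsedBytes(2) == 96, "two instances occupy 96 bytes");
```

Under these assumptions, declaring the maximum would mean Instance[1365] in the shader, while a draw call with 2 instances only needs 96 of the 65536 bytes.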
Kind regards