Constant buffer padding for array of structs

Started by November 14, 2017 08:09 AM
8 comments, last by galop1n 7 years, 2 months ago

I am confused why this code works, because the elements of the lights array are not 16-byte aligned.


struct Light
{
    float4 position;
    float radius;
    float intensity;
    // How does this work without adding
    // uint _pad0, _pad1;
};

cbuffer lightData : register(b0)
{
    uint lightCount;
    uint _pad0;
    uint _pad1;
    uint _pad2;
    // Shouldn't the shader be unable to read the second element of the lights array,
    // because after float intensity we need 8 more bytes to make the struct a multiple of 16 bytes?
    Light lights[NUM_LIGHTS];
}

This has erased everything I thought I knew about constant buffer alignment. Any explanation will help clear my head.

Thank you

Just compile your code with FXC and see the printed layout. Are you sure it works?

I compiled this:


struct Light
{
    float4 position;
    float radius;
    float intensity;
};

cbuffer lightData : register(b0)
{
	uint dummy1;
	Light lights[9];
	uint dummy2;
};

float4 main(uint idx : SV_VertexID) : SV_Position
{
	return (dummy1 + lights[0].position * lights[7].radius + lights[8].intensity + dummy2).xxxx;
}

And got this:


// cbuffer lightData
// {
//
//   uint dummy1;                       // Offset:    0 Size:     4
//
//   struct Light
//   {
//
//       float4 position;               // Offset:   16
//       float radius;                  // Offset:   32
//       float intensity;               // Offset:   36
//
//   } lights[9];                       // Offset:   16 Size:   280
//   uint dummy2;                       // Offset:  296 Size:     4
//
// }

It aligns lights[0] to 16 bytes. From this it looks like sizeof(Light) = 32. What puzzles me is that 32*9 = 288, not the 280 reported. What's totally NOT understandable is dummy2 being at offset 296 = 16 + 32*9 - 8, as if the compiler figured that lights[8] ends with 8 bytes of padding and decided to put dummy2 there. Would anyone care to guess wtf?

So I totally don't understand this now :D My FXC.exe is 6.3.9600 from Windows Kit 8.1.

One thing is for sure: directly in cbuffers, you can have 4-byte constants (uint, float, int, ...) on 4-byte boundaries. Also, structs will be aligned to 16 bytes.

The final takeaway is to always query the compiled version for offsets of individual constants, so you know where to memcpy what.
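
For what it's worth, the numbers in that printout are consistent with a simple reading: every array element starts on a 16-byte boundary, so the stride is 32 bytes, but the last element keeps its unpadded 24-byte size, and a following 4-byte scalar can pack into the leftover bytes. A minimal C++ sketch of that arithmetic, assuming this reading is right (the helper names are made up for illustration and are not part of any D3D API):

```cpp
#include <cstddef>

// Round n up to the next multiple of 16 bytes (one float4 register).
constexpr std::size_t alignUp16(std::size_t n) { return (n + 15) / 16 * 16; }

// Assumed rule: an array in a cbuffer starts on a 16-byte boundary.
constexpr std::size_t arrayStart(std::size_t bytesBefore) { return alignUp16(bytesBefore); }

// Assumed rule: every element except the last is padded to a 16-byte multiple.
// count must be at least 1.
constexpr std::size_t arraySize(std::size_t elemSize, std::size_t count) {
    return alignUp16(elemSize) * (count - 1) + elemSize;
}

constexpr std::size_t kLightSize    = 24;                        // float4 + 2 floats
constexpr std::size_t kLightsOffset = arrayStart(4);             // after uint dummy1
constexpr std::size_t kLightsSize   = arraySize(kLightSize, 9);  // lights[9]
constexpr std::size_t kDummy2Offset = kLightsOffset + kLightsSize;

static_assert(kLightsOffset == 16,  "matches 'lights[9]; // Offset: 16'");
static_assert(kLightsSize   == 280, "matches 'Size: 280'");
static_assert(kDummy2Offset == 296, "matches 'dummy2; // Offset: 296'");
```

This only models the scalar-after-array case from the printout; the FXC listing or reflection data remains the authority for any real layout.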

Yes. This has broken my understanding of constant buffer alignment. The behavior seems odd. If you place lightCount after the lights array, everything breaks.


cbuffer lightData : register(b0)
{
    // If this part is placed after the lights array, everything breaks
    ///
    uint lightCount;
    uint _pad0;
    uint _pad1;
    uint _pad2;
    ///
    Light lights[NUM_LIGHTS];
};

If you put an array of some type inside a cbuffer, the compiler will insert padding between elements if the size of the type is not a multiple of 16 bytes. In your case the Light struct is 24 bytes, so you get 8 bytes of padding after each element in the array. The array also always starts on a 16-byte boundary, which means there may be some padding before it, depending on what else is declared in the cbuffer.
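
On the CPU side, one way to mirror this layout is to make the padding explicit and let the compiler check it. A minimal sketch, assuming NUM_LIGHTS is 9 as in the test shader above and using hypothetical struct names (LightCPU, LightDataCPU) that do not appear in the thread:

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t NUM_LIGHTS = 9;  // assumption: same count as the test shader

// CPU mirror of the HLSL Light struct, padded to the 32-byte array stride.
struct LightCPU {
    float position[4];  // float4 position, offset 0
    float radius;       // offset 16
    float intensity;    // offset 20
    float _pad[2];      // 8 bytes the compiler inserts between array elements
};

// CPU mirror of cbuffer lightData from the original post.
struct LightDataCPU {
    std::uint32_t lightCount;
    std::uint32_t _pad[3];        // _pad0.._pad2 in the HLSL
    LightCPU lights[NUM_LIGHTS];  // starts on a 16-byte boundary
};

static_assert(sizeof(LightCPU) == 32, "array stride must be 32 bytes");
static_assert(offsetof(LightDataCPU, lights) == 16, "array must start at offset 16");
static_assert(sizeof(LightDataCPU) % 16 == 0, "buffer size must be a multiple of 16");
```

The mirror pads the last element too, so it is 8 bytes larger than what the shader reads. That is harmless here, since a D3D11 constant buffer's ByteWidth has to be a multiple of 16 anyway.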

Unless you are doing last-chance optimization on some NVIDIA hardware, you should just stick to a structured buffer for large storage of constants. You do not have the exotic (not C++-compatible) alignment rules, and you do not have the restriction of updating the full buffer (getting around that needs DX11.1 on Windows 8 at minimum, just FYI; DX12 is not a problem here anyway).
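
To illustrate the difference on the CPU side: as far as I know, elements of a StructuredBuffer<Light> are laid out with the stride you pass as StructureByteStride when creating the buffer, so the plain 24-byte struct can be copied as-is. A minimal sketch, assuming a hypothetical LightSB struct and helper name that are not from the thread:

```cpp
#include <cstddef>

// CPU mirror of Light for use with StructuredBuffer<Light>:
// no 16-byte rounding, the element is just its members back to back.
struct LightSB {
    float position[4];  // float4 position
    float radius;
    float intensity;
};

static_assert(sizeof(LightSB) == 24, "expected a tightly packed 24-byte element");
static_assert(sizeof(LightSB) % 4 == 0, "structure stride must be a multiple of 4");

// Byte offset of element i inside the buffer (hypothetical helper).
constexpr std::size_t lightByteOffset(std::size_t i) { return i * sizeof(LightSB); }
```

Here sizeof(LightSB) is the value you would put in StructureByteStride, and a plain array of LightSB can be uploaded without any manual padding.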

MJP, I understand why the 24 byte struct in array becomes 32 bytes, but why not its final element?

7 hours ago, pcmaster said:

MJP, I understand why the 24 byte struct in array becomes 32 bytes, but why not its final element?

Because it would be a waste, and because it is not C++. Don't overthink it :)

15 minutes ago, galop1n said:

Because it would be a waste, and because it is not C++. Don't overthink it :)

Then my recommendation remains - always use shader reflection (ID3D11ShaderReflection) to get the offsets of individual members, you can assume almost nothing.

1 hour ago, pcmaster said:

Then my recommendation remains - always use shader reflection (ID3D11ShaderReflection) to get the offsets of individual members, you can assume almost nothing.

My recommendation is even better, use a StructuredBuffer for that :)
