StructuredBuffer and matrix layout

Started by September 14, 2014 08:40 AM
3 comments, last by KaiserJohan 7 years, 5 months ago

Not sure if this is a bug or a feature, but apparently this code always generates column-major matrices:


StructuredBuffer<float4x4>

This happens regardless of whether I use D3DCOMPILE_PACK_MATRIX_ROW_MAJOR or #pragma pack_matrix(row_major).

Does anyone have an elegant way to fix it? It's a really irritating 'feature'.

Yes, I've seen that behavior as well when using compute shaders. My workaround was to use a StructuredBuffer<float4> instead and use 4 loads. In terms of the compiled assembly this isn't really any less efficient, since a float4x4 will get split into 4 loads when compiled to assembly (you can only load 4 DWORDs at a time from structured buffers).
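Roughly, that workaround might look like the sketch below. It assumes the CPU side writes each matrix as four consecutive float4 rows; gMatrixRows and LoadMatrix are hypothetical names, not from any real codebase.

```hlsl
// Hypothetical buffer: every matrix occupies four consecutive float4 elements,
// one element per row as laid out in CPU memory (assumed row-major).
StructuredBuffer<float4> gMatrixRows : register(t0);

// Hypothetical helper: rebuilds matrix number 'index' from its four rows.
float4x4 LoadMatrix(uint index)
{
    uint base = index * 4;

    // The float4x4 constructor fills the matrix row by row, so each
    // loaded float4 becomes one row of the result.
    return float4x4(gMatrixRows[base + 0],
                    gMatrixRows[base + 1],
                    gMatrixRows[base + 2],
                    gMatrixRows[base + 3]);
}
```

With this layout the buffer's element count on the CPU side has to be four times the matrix count, and the stride is 16 bytes instead of 64.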

While searching for an answer I stumbled upon your blog, which is just about the only place where this packing issue is mentioned (bad MSDN...). I actually took your advice and loaded the 4 vectors myself, only to find that the skinning shader went from 46 instructions with column-major to 176 instructions with row-major.

I knew that row-major is less efficient than column-major, but almost 4x the instructions is just too much.

Sorry for the bump, but I just stumbled upon this recently. I was frustrated because my matrices were already column-major when I sent them to the shader, so I couldn't understand why they came out transposed when read.

I just call transpose() on the matrix after reading it in the shader.
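Something along these lines, as a minimal sketch; gMatrices and LoadMatrixTransposed are hypothetical names, and whether the transpose is needed at all depends on how the CPU side lays out the data:

```hlsl
// Hypothetical buffer of matrices, read by the shader with whatever
// layout the compiler picks for StructuredBuffer<float4x4>.
StructuredBuffer<float4x4> gMatrices : register(t0);

// Hypothetical helper: load the matrix, then swap rows and columns so it
// matches the layout the rest of the shader expects.
float4x4 LoadMatrixTransposed(uint index)
{
    return transpose(gMatrices[index]);
}
```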

This topic is closed to new replies.
