Hi guys,
Is it possible to copy from a RWStructuredBuffer<float2x4> to a cbuffer of the same size using the CopyResource function?
According to MSDN, if the size, format, etc. are the same, it should work.
There is a note "You can't use an Immutable resource as a destination." - I guess by immutable they mean D3D11_USAGE_IMMUTABLE, so I used D3D11_USAGE_DEFAULT instead.
The RWStructuredBuffer<float2x4> is created like this:
D3D11_BUFFER_DESC desc = {}; // zero-initialize so unset fields (e.g. CPUAccessFlags) are 0
desc.ByteWidth = 2048; //64 lights * size of float2x4
desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
desc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
desc.StructureByteStride = 32; //size of float2x4
desc.Usage = D3D11_USAGE_DEFAULT;
hr = m_p_device->CreateBuffer(&desc, 0, &sourceBuffer);
D3D11_UNORDERED_ACCESS_VIEW_DESC uavd = {}; // zero-initialize so Buffer.FirstElement and Buffer.Flags are 0
uavd.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
uavd.Format = DXGI_FORMAT_UNKNOWN;
uavd.Buffer.NumElements = 64;
hr = m_p_device->CreateUnorderedAccessView(sourceBuffer, &uavd, &sourceBufferView);
// generate 64 lights and store them in sourceBuffer
Then the cbuffer is created like this:
D3D11_BUFFER_DESC desc = {}; // zero-initialize so unset fields are 0
desc.ByteWidth = 2048; //64 lights * size of float2x4
desc.BindFlags = D3D11_BIND_CONSTANT_BUFFER;
desc.Usage = D3D11_USAGE_DEFAULT;
hr = m_p_device->CreateBuffer(&desc, 0, &destinationBuffer);
Then the copy is done via the deferred context:
m_p_deferred_context->CopyResource(destinationBuffer, sourceBuffer);
// call the final lighting shader
In my lighting shader, I have 64 lights, float4 for color, float4 for position in view space, therefore float2x4.
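To make the sizes concrete, here is a rough CPU-side sketch of one light and the resulting buffer size. It is only an illustration: the names Float4, LightData, kLightCount, kLightStride and kBufferBytes are hypothetical and not part of the actual code, and it assumes the color comes first, followed by the position.

```cpp
#include <cstddef>

// Hypothetical stand-in for an HLSL float4 (four 32-bit floats, 16 bytes).
struct Float4 {
    float x, y, z, w;
};

// Hypothetical CPU-side mirror of one light: two float4 values,
// which is the same amount of data as one float2x4.
struct LightData {
    Float4 color;     // light color
    Float4 position;  // light position in view space
};

constexpr std::size_t kLightCount  = 64;                          // lights per frame
constexpr std::size_t kLightStride = sizeof(LightData);           // StructureByteStride
constexpr std::size_t kBufferBytes = kLightCount * kLightStride;  // ByteWidth

// These match the values used in the buffer descriptions: 32 and 2048.
static_assert(kLightStride == 32, "one light should be 32 bytes");
static_assert(kBufferBytes == 2048, "64 lights should be 2048 bytes");

// D3D11 requires the ByteWidth of a constant buffer to be a multiple of 16.
static_assert(kBufferBytes % 16 == 0, "cbuffer size must be a multiple of 16");
```

This only checks the byte counts on the CPU side. It says nothing about how the shader compiler lays out a float2x4 inside each kind of buffer.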
The colors and positions of the lights are generated in another shader on the fly, so I store them in RWStructuredBuffer<float2x4>.
Then in my final lighting shader, I have to read all 64 lights per pixel, so I could just read the data again from RWStructuredBuffer<float2x4>.
However, since I'm doing tons of other texture reading, I think it totally breaks the texture cache, because I get a huge fps drop.
So I tried to move the RWStructuredBuffer<float2x4> data into a cbuffer and I got almost double performance.
The problem is, it appears that the data layout of these buffers is somehow different.
For debugging, I divided the screen into 8x8=64 squares, and every square displays the color of one light from the RWStructuredBuffer<float2x4>.
If I read it as RWStructuredBuffer<float2x4>, everything is correct, with a few red, green and white lights:
However, if I read it from the copied cbuffer, I get this; the color channels are somehow messed up.
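The debug view boils down to mapping each pixel to one of the 64 squares. A minimal sketch of that mapping follows, written in C++ for illustration rather than as the actual shader code. The function name debugLightIndex, the constant kGridSize, and the row-by-row ordering of the squares are assumptions.

```cpp
#include <cstdint>

// Number of debug squares along each screen axis (8 x 8 = 64 squares).
constexpr std::uint32_t kGridSize = 8;

// Hypothetical helper: returns the index (0..63) of the light whose color
// is shown in the debug square that contains pixel (px, py).
// Assumes squares are numbered row by row, starting at the top-left corner.
constexpr std::uint32_t debugLightIndex(std::uint32_t px, std::uint32_t py,
                                        std::uint32_t screenWidth,
                                        std::uint32_t screenHeight) {
    // Scale the pixel position into grid coordinates.
    std::uint32_t cellX = px * kGridSize / screenWidth;
    std::uint32_t cellY = py * kGridSize / screenHeight;

    // Clamp in case a pixel on or past the screen edge is passed in.
    if (cellX >= kGridSize) cellX = kGridSize - 1;
    if (cellY >= kGridSize) cellY = kGridSize - 1;

    return cellY * kGridSize + cellX;
}
```

With a 1280x720 screen, for example, each square is 160x90 pixels, so the square directly right of the top-left one shows light 1 and the square directly below it shows light 8.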
Obviously, some data was copied and even the pattern was preserved:
Any idea what could have happened and how to do it correctly?
I could just do Map/Unmap, but since it's a deferred context, that's a bit tricky. Moreover, I'd like to avoid any CPU communication and another staging buffer, so I'd prefer to just use CopyResource.
Thanks.