Before I describe what I've tried, let me describe more precisely what I'm trying to achieve:
I have two ray generation shaders, s1 and s2. The purpose of s1 is to compute n float values vals[0 … n - 1]. After all invocations of s1 have finished executing, s2 needs to know the sum acc = vals[0] + … + vals[n - 1].
I'm having a hard time getting this to work …
So far, I've tried declaring RWTexture1D<float4> vals : register(u0); for this. The issue is that n is usually around 100,000. When I try to create the corresponding resource, I get an error claiming that the width of a D3D12_UAV_DIMENSION_TEXTURE1D cannot be larger than 25,000. I've also tried replacing RWTexture1D with RWBuffer<float>, but the error is similar. So, part of my question is what I should do about this. I don't really need to store all the vals. As I wrote above, I only need to know acc in the end. So, if necessary, I could compute multiple values in a single invocation of s1 and store their sum in vals. Of course, I could also call DispatchRays() with width = height = depth = 1 so that there is only one invocation of s1 and compute the whole sum in that invocation. However, that would be rather inefficient, since it would make no use of parallelism at all.
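To make the "multiple values per invocation" idea concrete, here is a minimal CPU-side sketch in C++ (not shader code). It only models the index arithmetic: each simulated s1 invocation sums a contiguous chunk of values and writes one partial sum. The names computeVal, invocationCount, partialSum and computePartials are hypothetical, and computeVal is just a placeholder for whatever a real invocation would compute.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical placeholder for the value a real s1 invocation would compute.
float computeVal(std::size_t i) {
    return static_cast<float>(i % 7) * 0.5f;
}

// Number of invocations needed when each one handles up to perInvocation values
// (this would be the dispatch width).
std::size_t invocationCount(std::size_t n, std::size_t perInvocation) {
    return (n + perInvocation - 1) / perInvocation;
}

// CPU model of a single invocation: sums its chunk [begin, end) of the n values.
float partialSum(std::size_t invocation, std::size_t n, std::size_t perInvocation) {
    const std::size_t begin = invocation * perInvocation;
    const std::size_t end = (begin + perInvocation < n) ? begin + perInvocation : n;
    float sum = 0.0f;
    for (std::size_t i = begin; i < end; ++i) {
        sum += computeVal(i);
    }
    return sum;
}

// Runs all simulated invocations; the result plays the role of the (smaller) vals.
std::vector<float> computePartials(std::size_t n, std::size_t perInvocation) {
    std::vector<float> partials(invocationCount(n, perInvocation));
    for (std::size_t inv = 0; inv < partials.size(); ++inv) {
        partials[inv] = partialSum(inv, n, perInvocation);
    }
    return partials;
}
```

With n = 100,000 and 8 values per invocation, computePartials(100000, 8) yields 12,500 partial sums, which would fit under the width limit reported in the error above while still leaving 12,500 invocations to run in parallel.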
Anyway, assuming I manage to fill vals, I wonder how I can then compute acc. I clearly don't want to compute it in every single invocation of s2, so I thought I would need a compute shader c1 for that. If there is a better option, I'd be curious to hear about it. I'm also quite lost about how I should specify numthreads (I've never used compute shaders before).
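For reference, this is roughly what I imagine c1 would have to do, written as a CPU-side C++ sketch rather than real HLSL. It models a common parallel reduction pattern: each thread group loads kGroupSize values into group-local storage, halves the number of active threads in every step, and writes one partial sum per group; the pass is repeated until a single value is left. Everything here is an assumption on my part: kGroupSize = 256 stands in for [numthreads(256, 1, 1)], the shared array stands in for groupshared memory, and the names reduceGroup, reducePass and reduceAll are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Assumption: would correspond to [numthreads(256, 1, 1)] in the compute shader.
constexpr std::size_t kGroupSize = 256;

// CPU model of one thread group reducing its slice of the input to a single value.
float reduceGroup(const std::vector<float>& input, std::size_t group) {
    std::array<float, kGroupSize> shared{};  // stands in for groupshared memory
    for (std::size_t t = 0; t < kGroupSize; ++t) {
        const std::size_t idx = group * kGroupSize + t;
        shared[t] = (idx < input.size()) ? input[idx] : 0.0f;  // pad the last group
    }
    // Tree reduction: on the GPU each step would be followed by a group barrier.
    for (std::size_t stride = kGroupSize / 2; stride > 0; stride /= 2) {
        for (std::size_t t = 0; t < stride; ++t) {
            shared[t] += shared[t + stride];
        }
    }
    return shared[0];
}

// One pass: ceil(size / kGroupSize) groups, one output value per group.
std::vector<float> reducePass(const std::vector<float>& input) {
    const std::size_t groups = (input.size() + kGroupSize - 1) / kGroupSize;
    std::vector<float> output(groups);
    for (std::size_t g = 0; g < groups; ++g) {
        output[g] = reduceGroup(input, g);
    }
    return output;
}

// Repeats passes until one value (acc) remains.
float reduceAll(std::vector<float> values) {
    if (values.empty()) {
        return 0.0f;
    }
    while (values.size() > 1) {
        values = reducePass(values);
    }
    return values[0];
}
```

In this model, 100,000 inputs shrink to 391 partial sums after the first pass, then to 2, then to 1, so three passes would be enough. If I understand correctly, the number of groups per pass is what would be passed to Dispatch(), while the group size is what numthreads specifies.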
Any help is highly appreciated!
Remark, in case it is helpful: s1 and s2 are not executed one after the other in every frame. s1 is only executed before s2 if something in the scene (or the camera position) has changed. That is, the value of acc stays constant as long as nothing the camera sees has changed.