JoeJ said:
You say Russian Roulette is needed to avoid bias. Is this because RR could - in theory(?) - generate paths of infinite lengths to get infinite bounces? Can't be. (I'll never know the precise meaning of ‘biased’ in PT : )
In path tracing we are generally looking for an ACCURATE solution of light propagation through the scene into the eye. That means that with an infinite number of samples we would reach the exact result (in practice this happens long before infinity, since we have limited precision in how we display color!) - i.e. with unbiased rendering the expected error compared to ground truth is 0. With biased rendering we trade some accuracy for bias - we will never converge to the actual result (but the bias error can be consistent, usually in the form of blur or similar), which still makes such methods viable - you compute faster at the cost of accuracy.
Mathematically speaking, our path tracer is an estimator - the expected value of an unbiased estimator is the population mean, regardless of the number of observations. The error in each estimate is therefore only due to random statistical variance (high-frequency noise). Variance is reduced by averaging n samples - the standard deviation of the estimate scales as 1/sqrt(n) - i.e. to cut the standard deviation of the error in half we have to take 4 times as many samples. A biased estimator introduces a bias error in an effort to reduce high-frequency noise (i.e. you will never reach the accurate value, only get as close as your bias error allows - the main point being that as long as the bias error is consistent it is often considered “good enough” for some applications).
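As a toy illustration of that 1/sqrt(n) behaviour, here is a self-contained host-side sketch (nothing renderer-specific - the integrand and the sample counts are made up for the demo):

```cpp
// Toy demo of the unbiased-estimator error scaling: the standard deviation of
// a Monte Carlo estimate falls as 1/sqrt(n), so 4x the samples ~ half the noise.
#include <cmath>
#include <cstdio>
#include <random>

const double kPi = 3.14159265358979323846;

// Unbiased estimate of the integral of sin(x) over [0, pi] (exact value: 2).
double estimate(int n, std::mt19937& rng)
{
    std::uniform_real_distribution<double> dist(0.0, kPi);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += std::sin(dist(rng));
    return kPi * sum / n;              // average times the measure of the domain
}

// Standard deviation of the estimator itself, measured over many independent runs.
double stddevOfEstimate(int n, int runs, std::mt19937& rng)
{
    double mean = 0.0, meanSq = 0.0;
    for (int r = 0; r < runs; ++r)
    {
        const double e = estimate(n, rng);
        mean   += e;
        meanSq += e * e;
    }
    mean   /= runs;
    meanSq /= runs;
    return std::sqrt(meanSq - mean * mean);
}

int main()
{
    std::mt19937 rng(42);
    std::printf("stddev with n samples:  %f\n", stddevOfEstimate(1000, 2000, rng));
    std::printf("stddev with 4n samples: %f\n", stddevOfEstimate(4000, 2000, rng));
}
```

The second number should come out at roughly half of the first - exactly the “quadruple the samples to halve the noise” rule.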
JoeJ said:
Let's say we support 4 bounces, and so no need for RR. Further assuming each traceRay call takes a similar amount of time, threads should not wait on each other too long.
Just FYI: if we only support 4 bounces, such an estimator is by definition already biased - all energy carried by paths longer than 4 bounces is simply missing from the estimate, so the expected value no longer equals the true result.
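For context on the RR question quoted at the top: a hard bounce cap silently drops the energy of longer paths, while Russian Roulette terminates paths probabilistically and divides the survivors' throughput by the survival probability, which keeps the expected value correct. A minimal sketch of that loop (everything here - Ray, Hit, RNG, traceRay, bsdfSample, rng01, maxComponent, the float3 operators such as those from CUDA's helper_math.h - is a placeholder, not anyone's actual code in this thread):

```cpp
// Sketch of unbiased termination via Russian Roulette versus a hard bounce cap.
__device__ float3 tracePath(Ray ray, RNG& rng)
{
    float3 radiance   = make_float3(0.0f, 0.0f, 0.0f);
    float3 throughput = make_float3(1.0f, 1.0f, 1.0f);

    for (int bounce = 0; ; ++bounce)
    {
        Hit hit;
        if (!traceRay(ray, hit))
            break;                                   // ray escaped the scene

        radiance += throughput * hit.emission;

        // Hard cap alternative: 'if (bounce == 4) break;' -> biased, because the
        // energy carried by longer paths is silently dropped from the estimate.

        // Russian Roulette: kill the path with probability (1 - p) and divide the
        // survivors' throughput by p, so the expected value stays unchanged.
        if (bounce > 3)
        {
            const float p = fminf(maxComponent(throughput), 0.95f);
            if (rng01(rng) >= p)
                break;
            throughput /= p;
        }

        throughput *= bsdfSample(hit, ray, rng);     // sample the next direction
                                                     // (also updates 'ray')
    }
    return radiance;
}
```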
Each traceRay can't take a similar amount of time, pretty much by definition - the scene would need to be equally complex everywhere (only single-primitive scenes without an acceleration structure, or things like spheres with infinite radius, would fit that). I'd like to point you at my traversal code (something similar will happen under the hood), e.g. here https://pastebin.com/43yRFBHu - the requirement of stack-based traversal over a generic scene means that the time taken in traceRay will vastly differ from ray to ray. This was described quite in depth here - https://research.nvidia.com/sites/default/files/pubs/2009-08_Understanding-the-Efficiency/aila2009hpg_paper.pdf
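To make that concrete, here is roughly the shape such a stack-based traversal takes (a simplified sketch, not the linked pastebin code - BVHNode, intersectAABB and intersectTriangle are placeholders). The loop count and the per-leaf work are entirely ray-dependent, which is where the divergent traceRay cost comes from:

```cpp
// Simplified shape of a stack-based BVH traversal.
__device__ bool traceRay(const Ray& ray, const BVHNode* nodes, Hit& hit)
{
    int stack[64];
    int stackPtr = 0;
    stack[stackPtr++] = 0;                            // push the root node

    bool found = false;
    while (stackPtr > 0)                              // iteration count varies per ray
    {
        const BVHNode& node = nodes[stack[--stackPtr]];
        if (!intersectAABB(ray, node.bounds, hit.t))  // skip nodes the ray misses
            continue;

        if (node.isLeaf())
        {
            for (int i = 0; i < node.triCount; ++i)   // dense leaves cost extra
                found |= intersectTriangle(ray, node.firstTri + i, hit);
        }
        else
        {
            stack[stackPtr++] = node.leftChild;       // descend into both children
            stack[stackPtr++] = node.rightChild;
        }
    }
    return found;
}
```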
JoeJ said:
This applies to individual samples, but not to the entire signal we try to integrate. We can imagine the signal as an environment map seen from the cluster of our pixels with similar normals. The light paths give us some information about this environment. A path that is good for a single pixel, likely is good for all pixels in the cluster. That's what i've meant, not the other problem caused be random eye path after the first hit. Sounds you have tried this idea, but with a larger number of light paths stored in VRAM. At a larger scale, it reminds me on Photon Mapping a bit.
Exactly. It was interactive back in 2014 or so - with today's GPUs you could probably get to real-time framerates (although the increased display resolution plays a bit against you … going from 1080p to 4K means you need 4 times the computational power and memory - GPUs back then had something like 1 GB - 3 GB of memory if I'm not mistaken (the Titan Z had 12 GB)). Nowadays I've got an RX 6800 here with 16 GB of memory … and the computational power is MUCH bigger.
My idea was that the light paths can be generated completely separately and regenerated only when needed, which saves you from re-tracing all light paths all the time (you could also regenerate more often the ones that are not yielding any connections, not just the ones affected by animated geometry/lights, if there are any). This can save quite a lot on the light-path side - plus it should yield better results for interactive rendering (with some filtering you could even get a smooth image in real time).
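A rough sketch of what such a persistent light-path cache could look like (all names - LightVertex, LightPathSlot, MAX_LIGHT_BOUNCES, STALE_FRAME_LIMIT - are made up for illustration, not the 2014 implementation):

```cpp
// Sketch of a persistent light-path cache kept in GPU memory.
// Paths are only re-traced when a heuristic flags them, not every frame.
struct LightVertex
{
    float3 position;
    float3 normal;
    float3 throughput;
};

struct LightPathSlot
{
    LightVertex vertices[MAX_LIGHT_BOUNCES];
    int         vertexCount;
    int         framesWithoutConnection;    // unproductive paths get recycled sooner
    bool        touchedByDynamicGeometry;   // set by the animation/update pass
};

__device__ bool needsRegeneration(const LightPathSlot& slot)
{
    return slot.touchedByDynamicGeometry
        || slot.framesWithoutConnection > STALE_FRAME_LIMIT;
}
```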
JoeJ said:
In Tabys example there are two paths per pixel, which is already a big ray budget. If every pixel needs to trace one more ray to 63 other light path vertices, or even #bounces times 63, that's surely too much. : )
The main problem this boils down to is: who generates the single light path for the block? If it is the same workgroup that then solves the eye paths and the connections, it doesn't make much sense, as the workgroup still needs to wait for the single thread that generates the light path first. The total time the workgroup takes would then depend on whether the light path is going to be “short” (i.e. minimum number of bounces, going through a minimal part of the acceleration structure) or “long” (i.e. many bounces, going through the “whole” acceleration structure). Keep in mind that the remaining 255 threads have to wait during that step - they have nothing to connect to yet.
This being said … you could possibly use the 1st thread to calculate the light path, and let the other threads calculate their own eye paths … once the 1st one finishes, the rest do the connection steps. You won't have any result for the 1st pixel that sample (but you could always pick a different thread to be the ‘light’ one … and after 256 samples you have 255 samples per pixel).
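A minimal sketch of that layout for one 16x16 tile (256 threads), written CUDA-style - Params, EyePath, LightVertex, traceLightPath, traceEyePath, connect, accumulate and MAX_LIGHT_BOUNCES are placeholders, not Taby's code:

```cpp
// "Rotating light thread" layout: one thread traces the shared light path while
// the other 255 trace eye paths, then everybody connects after a barrier.
__global__ void tileKernel(Params params, int sampleIndex)
{
    __shared__ LightVertex sharedLightPath[MAX_LIGHT_BOUNCES];
    __shared__ int         sharedLightVertexCount;

    const int lane      = threadIdx.x;            // 0..255 within the tile
    const int lightLane = sampleIndex % 256;      // rotate who traces the light path

    EyePath eye;
    if (lane == lightLane)
        sharedLightVertexCount = traceLightPath(params, sharedLightPath);
    else
        eye = traceEyePath(params, blockIdx.x, lane);   // 255 eye paths in parallel

    __syncthreads();                              // light path is now visible to all

    if (lane != lightLane)
    {
        const float3 c = connect(eye, sharedLightPath, sharedLightVertexCount, params);
        accumulate(params, blockIdx.x, lane, c);  // the 'light' pixel skips this sample
    }
}
```

The light-path thread still holds the barrier until it finishes, but at least the other 255 threads are doing useful eye-path work in the meantime instead of idling.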
The other approach would be to start a kernel where each thread generates one of N light paths (N being the number of tiles in your viewport) - and then each tile just traces its eye paths and connects to that single light path. The problem at that point is that you now store the light paths in global memory.
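Sketched as a separate pass, under the same placeholder names and caveats as above:

```cpp
// Two-pass variant: a small kernel first traces one light path per tile into a
// global buffer; the per-tile eye-path/connection kernel then only reads from it.
__global__ void generateLightPathsKernel(Params       params,
                                         LightVertex* lightVertices,      // tileCount * MAX_LIGHT_BOUNCES
                                         int*         lightVertexCounts)  // one count per tile
{
    const int tile = blockIdx.x * blockDim.x + threadIdx.x;
    if (tile >= params.tileCount)
        return;

    lightVertexCounts[tile] =
        traceLightPath(params, &lightVertices[tile * MAX_LIGHT_BOUNCES]);
}
// ...followed by a tile kernel like the one above, which now fetches its light
// path from global memory instead of tracing it itself.
```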
…
This all being said - there are generally 2 approaches, the first being “megakernels” (where you do the whole single-sample path tracing within a single kernel) and the other being … well … I can't remember the name, let's say "iterative" (where you keep paths alive, resolve them, and restart them once used … each time your kernel runs, you advance each path by a single segment). Both are applicable to bidirectional path tracing - yet the latter ends up with higher occupancy. Although I'm probably going too deep and too far here for Taby (sorry for the spam!).
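For reference, the scheme described as “iterative” above is usually called wavefront path tracing (Laine, Karras and Aila, “Megakernels Considered Harmful”, 2013). Its rough host-side structure - with PathState, grid() and all kernel names being assumptions made for this sketch - looks something like:

```cpp
// Rough shape of a wavefront ("iterative") path tracer: path state lives in
// global buffers and every kernel launch advances each live path by one segment.
struct PathState
{
    Ray    ray;
    float3 throughput;
    float3 radiance;
    int    bounce;
    bool   alive;
};

void renderOneSample(PathState* paths, int pathCount, int maxBounces)
{
    generateCameraRaysKernel<<<grid(pathCount), 256>>>(paths, pathCount);

    for (int bounce = 0; bounce < maxBounces; ++bounce)
    {
        // One segment per pass: short paths simply go dead and stop costing work,
        // instead of their threads idling inside a megakernel's bounce loop.
        extendPathsKernel    <<<grid(pathCount), 256>>>(paths, pathCount);  // traceRay
        shadeAndConnectKernel<<<grid(pathCount), 256>>>(paths, pathCount);  // BSDF + connections
        // Optionally compact away dead paths here to keep occupancy high.
    }

    resolveToFramebufferKernel<<<grid(pathCount), 256>>>(paths, pathCount);
}
```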
EDIT:
taby said:
Please let me have some time to go through your comments. I thank you for the plethora of ideas to try (and not to try).
Feel free to. I hope I don't sound too rude in the comments - if I do, then sorry. I literally wrote down what was on my mind while reading the lines and trying to understand the code. If you want any clarification or such, feel free to ask - I should be available.