This is an idea I had.
Let's start by peeking at a simplified pipeline -
First, we render the viewed-from-light Shadow Maps, which contain the Z from the depth buffer. So far, the usual shadow mapping.
Next comes the first pass that is part of my idea. It executes a pixel shader, because we need the Z-buffer.
The Shadow Map is sampled here for the first pass of the Path Tracing. Why is explained later.
“Hit x,y,z” simply contains the visible part of the projected geometry. Instead of saving the color, it saves the x,y,z of the texels that survived the Z ordering.
“Next x,y,z” is generated using the normal of the surviving texels. “Next” takes the “Hit x,y,z” (the texel's x,y,z) and mirrors the ray from the camera into the new (next) direction. In the case of a mirror, this is a straightforward thing to do; later you will see how it is done for a regular texture with roughness. The new (Next) ray is saved to VRAM as the unit x,y,z of the vector.
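For the mirror case, that mirroring is just the standard reflection formula r = d - 2(d·n)n, with d the unit direction from the camera to the hit and n the unit normal. A minimal CPU-side sketch, assuming a bare float3 type (the names here are mine, not part of the pipeline):

```cpp
#include <cmath>

struct float3 { float x, y, z; };

float3 operator-(float3 a, float3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float3 operator*(float3 v, float s)  { return {v.x * s, v.y * s, v.z * s}; }
float  dot(float3 a, float3 b)       { return a.x * b.x + a.y * b.y + a.z * b.z; }
float3 normalize(float3 v)           { return v * (1.0f / std::sqrt(dot(v, v))); }

// Mirror the camera ray around the surface normal: r = d - 2*dot(d, n)*n.
// The result is what gets saved to VRAM as the unit "Next x,y,z".
float3 nextRayMirror(float3 camPos, float3 hitPos, float3 n) {
    float3 d = normalize(hitPos - camPos);          // incoming view ray
    return normalize(d - n * (2.0f * dot(d, n)));   // mirrored Next direction
}
```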
So far, in VRAM we have one hit, or one bounce, of a ray: Start → Hit → End.
Meanwhile, the shadow mapping provides simple shadowing in this pass. This shadowing could already be sent to the monitor; it is ready to see. But in the milliseconds we still have, we will try to make it more realistic by tracing more rays around.
And here comes the compute pass -
It performs the classical ray tracing operations on the Start → Hit → End that the pixel shader computed: it checks whether, between Hit and End, some triangle obstructs the ray.
We make sure there is a dome in the game, so any ray eventually hits something. The x,y,z of the point the ray hit is stored in another resource for the next pass. At the same time, using the normal at the new hit, we can compute in the same shader the direction of the next ray to trace.
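The text does not name a particular intersection test, so assume the classic Möller–Trumbore algorithm, restricted to the segment between Hit and End. A CPU-side sketch of the per-triangle check the compute pass would run for each candidate the BVH traversal produces:

```cpp
#include <cmath>

struct float3 { float x, y, z; };
float3 sub(float3 a, float3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float3 cross(float3 a, float3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
float dot(float3 a, float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Moller-Trumbore, limited to the segment Hit -> End (t in (0,1)):
// returns true if triangle (v0,v1,v2) obstructs the ray between the
// current hit and the dome.
bool segmentBlocked(float3 hit, float3 end, float3 v0, float3 v1, float3 v2) {
    const float eps = 1e-6f;
    float3 dir = sub(end, hit);
    float3 e1  = sub(v1, v0);
    float3 e2  = sub(v2, v0);
    float3 p   = cross(dir, e2);
    float det  = dot(e1, p);
    if (std::fabs(det) < eps) return false;      // ray parallel to triangle
    float inv  = 1.0f / det;
    float3 s   = sub(hit, v0);
    float u    = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;      // outside barycentric u
    float3 q   = cross(s, e1);
    float v    = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;  // outside barycentric v
    float t    = dot(e2, q) * inv;
    return t > eps && t < 1.0f - eps;            // strictly between Hit and End
}
```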
The number of compute shaders executed depends on the number of rays to path-trace.
In the case of rough textures, textures with alpha and so on, the compute shaders just generate more “Next rays”. This uses more memory and computation :(((
We can also discard any pixel that falls under a direct light. Most of the time, it makes little visual difference whether or not we add path tracing to areas already lit up by direct rays. In the case of the first/pixel shader, this is shown here:
We can see here the shadow mapping logic from the first/pixel shader pass.
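Sketched on the CPU, that first-pass logic is the usual shadow-map compare; the row-major matrix, the flipped V, and the bias value are assumptions of this sketch, not details from the original:

```cpp
struct float3 { float x, y, z; };
struct float4 { float x, y, z, w; };
struct float4x4 { float m[4][4]; };      // row-major, w = 1 implied for points

float4 mul(const float4x4& M, float3 p) {
    return { M.m[0][0]*p.x + M.m[0][1]*p.y + M.m[0][2]*p.z + M.m[0][3],
             M.m[1][0]*p.x + M.m[1][1]*p.y + M.m[1][2]*p.z + M.m[1][3],
             M.m[2][0]*p.x + M.m[2][1]*p.y + M.m[2][2]*p.z + M.m[2][3],
             M.m[3][0]*p.x + M.m[3][1]*p.y + M.m[3][2]*p.z + M.m[3][3] };
}

// sampleZ stands in for the shadow-map texture fetch of the real shader.
// Lit means: our depth from the light is not behind the recorded occluder Z.
template <typename ShadowMap>
bool litByDirectLight(float3 worldPos, const float4x4& lightViewProj,
                      ShadowMap sampleZ, float bias = 0.001f) {
    float4 p = mul(lightViewProj, worldPos);
    float u = p.x / p.w *  0.5f + 0.5f;  // NDC -> shadow-map texture coords
    float v = p.y / p.w * -0.5f + 0.5f;  // D3D-style flipped V
    float z = p.z / p.w;                 // this texel's depth from the light
    return z <= sampleZ(u, v) + bias;    // bias fights shadow acne
}
```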
The next/compute shader pass adds Ray Tracing:
This is the easiest place to see RT and SM working together. Right after finding the hit, we check whether it is lit directly by a light by asking the Shadow Map. The principle is the same: we transform the world-space position of the collision texel into the view of the light and check if its distance to the light is more or less than the Z stored in the Shadow Map texture. It is just regular Shadow Mapping executed again, but here, after the first Ray Tracing. If the hit is directly lit, we can interrupt the path here too, in order to save computation. If we keep path tracing instead, the extra ray from SM will contribute to the final color of the texel (the color at the origin of the path).
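A hedged sketch of that per-bounce decision. PathState and the early-out policy are my naming; the text also allows the other branch, keeping the path alive and just adding the SM ray's contribution:

```cpp
struct float3 { float x, y, z; };

struct PathState {
    float3 throughput;   // how much each bounce still scales incoming light
    bool   alive;        // false once the path has been interrupted
};

// One bounce of the compute pass. `lit` is the shadow-map compare at the
// new hit (see the previous sketch); `directLight` is the light arriving
// there straight from the source. If the hit is directly lit, the SM ray
// already answered the question, so add its contribution and cut the path.
void onHit(PathState& s, bool lit, float3 directLight, float3& accumulated) {
    if (!s.alive) return;
    if (lit) {
        accumulated.x += s.throughput.x * directLight.x;
        accumulated.y += s.throughput.y * directLight.y;
        accumulated.z += s.throughput.z * directLight.z;
        s.alive = false;   // interrupt the path here to save computation
    }
    // Otherwise the path keeps going: a Next ray is generated from the hit
    // normal and written out for the following compute pass.
}
```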
You can see how this works for the first two bounces. The first projection, in the first pixel shader, checks rays for intersections with geometry; it is Ray Tracing done by projection. In the middle is the classical Ray Tracing, done by iterating over the triangles inside a BVH to find the closest hit. And the last ray is traced by the shadow mapping: if that last ray intersects with something, the Shadow Map texture will give a negative.
As the next sketch shows, it is basically path tracing, but at every hit we introduce an extra ray that is for free and comes from shadow mapping. It is just a read and a compare. It is valid Ray Tracing, but precomputed by the Shadow Mapping.
Because the computing is performed in the following order -
we need to have in VRAM the information shown next:
The layout of this information is distributed over various resources, but virtually, this is the struct of data per path.
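As a sketch, that virtual per-path struct could look like the following. Only Start, the Hit and Next entries, and the extra light sample are fixed by the text; MAX_BOUNCES and the remaining fields are assumptions:

```cpp
constexpr int MAX_BOUNCES = 4;           // the "max depth" of the path

struct float3 { float x, y, z; };

struct PathData {
    float3 start;                        // origin of the path (the camera texel)
    float3 hit[MAX_BOUNCES];             // "Hit x,y,z" per bounce
    float3 next[MAX_BOUNCES];            // "Next x,y,z", unit directions
    bool   litBySM[MAX_BOUNCES];         // shadow-map result at each hit
    float3 extraLightSample;             // approximated light from the last ray
    int    bounces;                      // how deep this path actually got
};
```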
The extra light sample happens when the max depth (max number of bounces) is consumed but we can still get some more light info. The last shader could produce one last ray (or not) and test how much this ray is looking into the light source. It is a trick that gives approximated results. (A Next ray is generated, but no more ray tracing, i.e. ray/triangle intersection tests, will be performed on it.)
The way the light affects this last ray can be tricked too -
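One way such a trick could look; this is my reading of “how much this ray is looking into the light”, not the author's exact formula, and the falloff exponent is an invented knob:

```cpp
#include <algorithm>
#include <cmath>

struct float3 { float x, y, z; };
float3 sub(float3 a, float3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float  dot(float3 a, float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
float3 normalize(float3 v) {
    float inv = 1.0f / std::sqrt(dot(v, v));
    return {v.x * inv, v.y * inv, v.z * inv};
}

// Score the last unit Next ray by its alignment with the direction toward
// the light, with no intersection test at all: 1 = looking straight at it,
// 0 = perpendicular or away. Cheap, and only an approximation.
float lastRayLight(float3 hitPos, float3 nextRay, float3 lightPos,
                   float lightIntensity, float falloffExp = 8.0f) {
    float3 toLight = normalize(sub(lightPos, hitPos));
    float a = std::max(0.0f, dot(nextRay, toLight));
    return lightIntensity * std::pow(a, falloffExp);
}
```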
When we read a unit ray (Next x,y,z), we find ourselves in a situation of very little workload for a lot of threads. To deal with that, we can use the AMD extensions to pile the workload into a flat array or into the corner of a texture. This way the CUs will have more work. Otherwise, more than half of the threads will not have a NextRay to work on. (After the second bounce, directly lit texels should really be discarded. Visually speaking, the deeper into the path, the less the light contributes. Statistically speaking, there could be exceptions, like some Death Star laser beam, but in that case, in a game, I personally would tend to skip the work.)
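A CPU stand-in for that compaction; on the GPU this would be wave intrinsics, the AMD extensions mentioned above, or an append buffer, and each loop iteration here plays the role of one thread:

```cpp
#include <atomic>
#include <cstddef>
#include <optional>
#include <vector>

struct float3 { float x, y, z; };

// Every "thread" that still owns a live NextRay appends it to a flat
// array, so the following dispatch runs on dense, busy waves instead of
// waves that are more than half idle.
void compactNextRays(const std::vector<std::optional<float3>>& nextRays,
                     std::vector<float3>& dense) {
    std::atomic<std::size_t> cursor{0};
    dense.resize(nextRays.size());
    for (std::size_t i = 0; i < nextRays.size(); ++i) {  // one iteration ~ one thread
        if (!nextRays[i]) continue;                      // dead path: nothing to append
        dense[cursor.fetch_add(1)] = *nextRays[i];       // atomic append keeps it dense
    }
    dense.resize(cursor.load());                         // only the live rays remain
}
```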
Although it is clear how it works, the real implementation in practice could differ in the details. A LOT of work needs to be done to take the sketches to a real-life implementation, so many things could change.
If we generate only one ray from each hit, having one extra ray for free from Shadow Mapping doubles the quality of the result. For free. The more rays we generate from a hit, the less noticeable the improvement from Shadow Mapping will be.
Because after the first shadow mapping we already have an image that looks good enough for many games, we can interrupt the whole path tracing process at any moment, "blend" the final color of the path, and show it to the gamer. If the scene has few triangles and transparencies, the time window will allow us to compute deeper paths; but if the player turns his head and looks into a forest, we will get close to the end of the time window right after the shadow mapping, abort it all, and show the shadow-mapped forest without a single ray traced.
(Notice a serious overkill in the case of transparent textures: SM + alpha = a hell of layers. I started to build the pipeline in DX, but the transparency of textures made me abandon the project (that, and a lack of motivation).)
(When I say “texel” I mean “point”, or “the smallest voxel possible”: a point in 3D space, not a pixel taken from a texture. A point in space with a meaning, with a texture/material applied to it.)