
implement and understand voxel cone tracing

Started by April 20, 2018 12:07 PM
20 comments, last by Vilem Otte 6 years, 9 months ago

Ok, it sounds like lightmaps are the best way to go for you - I have little doubt about this. I don't see a need for a completely dynamic solution like voxel cone tracing.

I don't have much experience with baking and UV unwrapping in applications like Blender, but I would try this first to save a lot of time.

The primary issue: to support normal mapping (if necessary), your lightmap texels need directional information, encoded as something like a primary light direction, spherical harmonics (2-band SH is mathematically the same as a primary direction: 3 values for direction and a 4th for constant ambient, so it's easy to understand and good enough), spherical Gaussians, etc.
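For illustration, here is a minimal sketch of how such a texel could be evaluated against a normal-mapped surface (all names are hypothetical, assuming the "primary direction + constant ambient" encoding):

#include <algorithm>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// One lightmap texel: dominant incoming light direction plus a constant
// ambient term - roughly equivalent to 2-band SH.
struct DirectionalLightmapTexel {
    Vec3 primaryDir;   // dominant light direction (normalized)
    Vec3 primaryColor; // intensity along that direction
    Vec3 ambient;      // constant (band 0) term
};

// Evaluate the texel against the per-pixel, normal-mapped shading normal.
Vec3 evaluateTexel(const DirectionalLightmapTexel& t, const Vec3& n)
{
    float ndotl = std::max(0.0f, dot(n, t.primaryDir)); // clamped cosine
    return { t.ambient.x + t.primaryColor.x * ndotl,
             t.ambient.y + t.primaryColor.y * ndotl,
             t.ambient.z + t.primaryColor.z * ndotl };
}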

You might wanna open a new topic and ask if / how Blender can generate this directional data, or if someone knows an application that can.

If you can't find a practical solution and need to bake yourself, MJP has a good blog series about it: https://mynameismjp.wordpress.com/2016/10/09/new-blog-series-lightmap-baking-and-spherical-gaussians/ and he also has a nice baking project on GitHub. It's C++, but you might be able to use it (if allowed). There should be others as well. Besides Optix, there are also Embree and Mitsuba for ray tracing. All C++ of course, but maybe there are C# wrappers around.

For automated UVs I know these two: https://github.com/Microsoft/UVAtlas/wiki/UVAtlas (Halo) and https://github.com/Thekla/thekla_atlas (from The Witness)

For dynamic objects like characters, a dense grid of low-frequency probes seems the easiest approach, using SH2, SH3 or SG. (Low-res cube maps would also work, but that's probably slower.) This is another point where I'm unsure if Blender can generate this easily, but you could add small cubes to the scene (not casting shadows or contributing to GI), attach textures to each side, and turn the resulting lightmaps into SH or whatever afterwards yourself. I've learned about SH from here: http://www.paulsprojects.net/opengl/sh/sh.html UE4's more advanced approach is this: https://docs.unrealengine.com/en-us/Engine/Rendering/LightingAndShadows/VolumetricLightmaps
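If you go the SH route, the projection itself is not much code. A minimal sketch (hypothetical names, one color channel only, assuming you already extracted radiance samples and their directions from the cube faces):

#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct SH2 { float c[4]; }; // band 0 + band 1: 4 coefficients

// Evaluate the first two SH bands for a direction.
static void shBasisL1(const Vec3& d, float out[4])
{
    out[0] = 0.282095f;       // Y(0,0)
    out[1] = 0.488603f * d.y; // Y(1,-1)
    out[2] = 0.488603f * d.z; // Y(1,0)
    out[3] = 0.488603f * d.x; // Y(1,1)
}

// Monte Carlo projection: directions are assumed uniformly distributed
// over the sphere, radiance holds one value per direction.
SH2 projectRadiance(const std::vector<Vec3>& dirs, const std::vector<float>& radiance)
{
    SH2 sh = {};
    float basis[4];
    for (std::size_t i = 0; i < dirs.size(); ++i) {
        shBasisL1(dirs[i], basis);
        for (int k = 0; k < 4; ++k)
            sh.c[k] += radiance[i] * basis[k];
    }
    const float norm = 4.0f * 3.14159265f / float(dirs.size()); // sphere area / sample count
    for (int k = 0; k < 4; ++k)
        sh.c[k] *= norm;
    return sh;
}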

Just some spotlights and ambient, as you suggest, might work, but the more photorealistic / consistent with the background you want to be, the harder it becomes.

For reflections (if necessary), a sparse grid or manual placement of probes is state of the art. Missing occlusion for reflections is probably the most obvious problem we have with current graphics, but it should be acceptable.

From all this you should be able to get something comparable to the typical UE4 look using baked lighting, so high quality. (But remember how good The Witness looks even without normal maps - it uses only simple high-res lightmaps.)

56 minutes ago, evelyn4you said:

- lights have static position but with slow day and night cycle. (At sunset, sitting on the terrace ...)

It sounds like dynamic time of day is not super important, and you might be able to use a changing environment probe or even blend between 2 or 3 lightmaps.

Otherwise, this is an indication to precalculate just the light transport and update the final lighting at runtime, like Enlighten does. Good article here: http://copypastepixel.blogspot.co.at/ Those ideas are compatible with both lightmaps and probes, but I don't think you need to go there.

 

... some work :) Eventually it might still make sense to switch to the Unity / UE4 wagon instead. Porting your character tech to those should surely be possible, and you get TAA, AO, etc. as well. However, personally I really like to see people still develop custom engines!


Hello JoeJ,

many thanks for your manifold hints. At the moment I am still considering:

- implementing anisotropic voxel GI (according to my studies, several different methods exist)

- improving my reflection cube map / light probe method for GI

Some results about light probes

I changed the code for more efficient creation of cube maps and tested performance with Sponza / Atrium:
- 1 directional light with shadow map
- 1 long-range point light with omnidirectional shadow map
- a few dozen point lights without shadow maps
- 100 GI cube maps (all the same resolution, e.g. 32 (64, 128, 256)), with mipmapping

Frustum culling for every cube map, updating dynamically only when something in the frustum changes.

To my big surprise, the resolution of the cube map has nearly NO effect on render performance as long as the resolution stays under 512.

That is quite astonishing. It seems the pixel shader is not the bottleneck here.

E.g. with 2 dynamic cube maps I get 50 FPS for the scene; the performance decreases to 1 FPS with 100 dynamic cube maps.

So it shows that about 3 dynamic cube maps per frame do not cause a big render hit. The reflections are nice eye candy.

For slow changes, all of the e.g. 100 cube maps could be updated over a cycle of about 1 second.
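A minimal sketch of such a round-robin schedule (names hypothetical):

#include <cstddef>

// Refresh 'perFrame' probes each frame, cycling through all of them.
// At 50 FPS with 2 probes per frame, 100 probes complete a cycle in ~1 s.
class ProbeScheduler {
public:
    explicit ProbeScheduler(std::size_t probeCount) : count(probeCount) {}

    template <typename RenderFn>
    void tick(std::size_t perFrame, RenderFn renderProbe)
    {
        for (std::size_t i = 0; i < perFrame && count > 0; ++i) {
            renderProbe(next);         // re-render this probe's cube map
            next = (next + 1) % count; // wrap around the probe list
        }
    }

private:
    std::size_t count = 0;
    std::size_t next = 0;
};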

But in this scenario I have to place the maps manually, which is the most efficient for render performance but bad when editing the scene. And 100 is still too few.

This is too slow; fast lighting changes, e.g. opening a door into a room that gets filled with light, will not look good. The lighting will change too slowly.

My method is still too inefficient, because the low-res cube maps, e.g. the 32 version, should be doable much faster.

I would like to implement a dynamic regular grid of light probes later (SH2 or SH3) and interpolate between them (only in space where there is no vertex data).
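The interpolation I have in mind would be something like this minimal sketch (hypothetical names; 4-coefficient SH2 probes in a regular grid of at least 2 cells per axis):

#include <cmath>
#include <vector>

struct SH2 { float c[4]; };

struct ProbeGrid {
    int nx, ny, nz;          // grid dimensions
    float cellSize;          // world-space probe spacing
    float ox, oy, oz;        // world-space origin of probe (0,0,0)
    std::vector<SH2> probes; // nx * ny * nz entries

    const SH2& at(int x, int y, int z) const { return probes[(z * ny + y) * nx + x]; }
};

// Trilinearly blend the 8 probes surrounding a world-space position.
SH2 sampleTrilinear(const ProbeGrid& g, float wx, float wy, float wz)
{
    float gx = (wx - g.ox) / g.cellSize;
    float gy = (wy - g.oy) / g.cellSize;
    float gz = (wz - g.oz) / g.cellSize;

    auto clampi = [](int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); };
    auto clampf = [](float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); };

    int x0 = clampi((int)std::floor(gx), 0, g.nx - 2);
    int y0 = clampi((int)std::floor(gy), 0, g.ny - 2);
    int z0 = clampi((int)std::floor(gz), 0, g.nz - 2);
    float fx = clampf(gx - x0), fy = clampf(gy - y0), fz = clampf(gz - z0);

    SH2 r = {};
    for (int dz = 0; dz < 2; ++dz)
    for (int dy = 0; dy < 2; ++dy)
    for (int dx = 0; dx < 2; ++dx) {
        float w = (dx ? fx : 1 - fx) * (dy ? fy : 1 - fy) * (dz ? fz : 1 - fz);
        const SH2& p = g.at(x0 + dx, y0 + dy, z0 + dz);
        for (int k = 0; k < 4; ++k)
            r.c[k] += w * p.c[k];
    }
    return r;
}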

My frustum culling checks the planes of the view pyramid with, so to speak, endless depth. I don't know how to implement a good layered method. When creating light probes in a regular grid, they should not be treated individually; "the knowledge/information" of the already updated cube maps should be used to update the others much faster.

 

2 hours ago, evelyn4you said:

To my big surprise, the resolution of the cube map has nearly NO effect on render performance as long as the resolution stays under 512.

That is quite astonishing. It seems the pixel shader is not the bottleneck here.

So you want to render cube maps in real time? I assume what is happening is that the GPU is not saturated when the resolution is < 512, so you don't get a win from lower resolutions.

Personally I work on realtime GI, and I can update maybe 60,000 spherical environment maps in < 4 ms, IIRC. 10,000 seem necessary for an average game. The resolution is 4x4 or 8x8 - enough for bump mapping and reflections on rough metals. I do not utilize hardware rasterization; it's all compute shaders. Very complex, 10 years of work. See the Many-LoDs or Imperfect Shadow Maps papers for similar, simpler but slower ideas.

I think CryEngine has some rasterized realtime cube map functionality; maybe they explain how much is possible there. (They also have a nice voxel GI implementation and describe it in their manual pages.)

2 hours ago, evelyn4you said:

This is too slow; fast lighting changes, e.g. opening a door into a room that gets filled with light, will not look good. The lighting will change too slowly.

What helps here is a simple temporal filter (exponential moving average): visibleX = visibleX * 0.9 + updatedX * 0.1
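A minimal sketch of that filter applied per probe (hypothetical names, assuming SH2 probes):

struct SH2 { float c[4]; };

// Blend the freshly rendered probe into the one currently used for shading.
// keep = 0.9 means 90% history, 10% new data per frame.
void blendProbe(SH2& visible, const SH2& updated, float keep = 0.9f)
{
    for (int k = 0; k < 4; ++k)
        visible.c[k] = visible.c[k] * keep + updated.c[k] * (1.0f - keep);
}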

It makes the transitions smooth and thus more pleasant to the eye, but updating 2 of 100 probes per frame will always just be too slow.

Here is some recent work that makes compromises between precomputation and dynamic updates while achieving high quality: https://users.aalto.fi/~silvena4/Projects/RTGI/index.html

... pretty complex as well, and still limited and expensive.

2 hours ago, evelyn4you said:

But in this scenario I have to place the maps manually, which is the most efficient for render performance but bad when editing the scene.

Manual editing takes time, but coding automation or alternatives may take even more time :)

My personal advice and warnings:

If you decide to go into GI research now, you will likely waste your time. Others will solve it before you. Even if you decide to implement existing methods, trying this and that, you may waste time as well. Personally I made this decision many years ago, and to produce results (hopefully soon) I've made some sacrifices: stop working on a game, stop working on second interests, mainly about characters: walking ragdolls, improved skinning, muscle and bone simulation, correct anatomy - similar interests to yours. There is not enough time for everything, so I just think you should keep your focus on characters and not try to deal with unsolved problems in lighting.

Here is a recent VXGI presentation from NV: http://on-demand.gputechconf.com/siggraph/2014/presentation/SG4114-Practical-Real-Time-Voxel-Based-Global-Illumination-Current-GPUs.pdf

Some tricks about dealing with the limitations - I'm surprised they offer anisotropic voxels, but I guess it's a high-end GPU feature.

(I'm even more surprised about their claim that they've invented 'clipmaps instead of octrees' - what's next? Maybe they invented the wheel as well? :D )

 

Edit: I picked the old presentation accidentally; here's the new one: http://developer.download.nvidia.com/assets/gameworks/downloads/secure/GDC18_Slides/Advances in Real-Time Voxel-Based GI - GDC 2018.pdf?7vYTOyxJCLXubKH1AR2CPtOlLNPbFGukj0ThdO5JdNcV0uE_B6jwD9PUCWXwhNaZweCTUbBQts_WE5JwjLOjGDtfSWGLj9JMLa6pigXEjMalOCbBomdpVtSsb5DOK-y1lIUrUl3KV0NSyMxo1Ofza1mDosiD4awrRzJEMDn8Pg223j71nQRNjyF-rjOaxWVTxgXhHRWTlgCFuc6slonL-Tk1qs6DKCWYu40tMxc5kQ

Seems they dropped anisotropic voxels now. (Usually you get the same advantage just by doubling resolution, which is cheaper.)

@JoeJ My feelings about VXGI are somewhat mixed (you may remember me implementing it here on DX12 - it's actually one of the features in my engine now).

First, shameless self-promotion:

 

And I recommend looking carefully at the video, as it shows a lot of the flaws of VXGI (and why my feelings about it are mixed) - and yes, it also uses reflections from cone tracing:

  • Noise - clearly visible; this can be solved (to some extent) with a temporal filter.
  • Resolution - in this case it is quite high (note the sharp reflection on the sphere), although it's important to note that this is a limiting factor.
  • Scale - hand in hand with resolution: the scale of your voxel volume will have a significant impact on GI quality. Obtaining lightmap-like results is possible, but requires careful handling.
  • Octrees (SVO) - significantly reduce the amount of memory used, yet significantly increase traversal cost. Not worth it: a 3D texture outperforms them literally every time, and building an SVO from large-scale scenes is performance-heavy.
  • Generation - changing lighting or moving objects around will result in re-generation (or at least some amount of computation) among voxels. This may be costly, depending on the complexity of the scene (that intersects the voxel volume, of course).
  • Anisotropic voxels - not worth it. Increasing resolution is the easier way around, with better results, and it doesn't increase computational complexity.
  • Static vs. dynamic - generally, pre-computing static data and then inserting dynamic objects is worth it. It requires an additional 1-bit flag, though.

Compared to other GI solutions I've tried (which have to be mentioned - and were somewhat successful on my side):

  • Reflective Shadow Maps (RSM) - no secondary shadows; somewhat good performance, but it doesn't look nearly as good. VXGI is superior.
  • Imperfect Shadow Maps - an extension of RSM with secondary shadows; insane fillrate; beats VXGI in some cases.
  • Light probes - very good looking when mixed with lightmaps; the downside is the lightmaps (precomputing!); beats VXGI for static scenes (when used with lightmaps).
  • Path tracing / BDPT (realtime) - the ground truth for me (I did multiple GPU implementations - mega-kernels and batched); getting this to run in realtime is a real challenge (it requires good GPU ray tracers).
  • Realtime radiosity - a sort of lightmap solution that is updated progressively each frame (e.g. you update part of your lightmap); it works for somewhat semi-dynamic scenes with low-resolution maps.

The downside of any lightmap or radiosity solution is that you need to somehow unwrap your scene into an actual lightmap - which I have always found somewhat problematic.

I often decide like this (when picking a GI solution for a project):

  • Is your scene static? Then use lightmaps (or a sort of lightmaps).
  • Is your scene a mix of static and dynamic? Light probes and lightmaps, if possible.
  • Is your scene dynamic and small? VXGI/ISM.
  • Is your scene dynamic and big? Impossibru ... you either think of a good way to fake it, or avoid GI (although note that for big dynamic scenes, GI will most likely be the last of your performance problems).
On 4/23/2018 at 7:54 PM, JoeJ said:

If you decide to go into GI research now, you will likely waste your time.

This depends on what your goal is.

If it is to actually research GI - then it's definitely not wasted time. If you just want to implement existing techniques to compare them, then well... unless you implement other techniques as a hobby (in which case you voluntarily choose to spend your time this way) or from a professional point of view (i.e. to have something to compare your research against under exactly the same conditions), then yes - you will most likely waste your time.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

7 hours ago, Vilem Otte said:

Noise - clearly visible; this can be solved (to some extent) with a temporal filter.

Thanks for the video showing the flickering caused by popping voxels - usually people only show scenes where issues do not happen, but showing them helps a lot more.

Let us know if you ever implement (or improve?) the temporal filter. (Thinking of what's currently going on with denoising, it should really be possible, I agree.)

Personally I did not have those issues when experimenting with voxels, because I built a prefiltered volume. Instead of using rasterization, I accumulated surfels (calculated from lightmap texels at the surface) into the 8 closest voxels. This was fast as well, but I did it only single-threaded on the CPU; a GPU version would require atomics to global memory. I'm not sure how this compares to voxelization by rasterization performance-wise. It would be interesting to try.
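A minimal sketch of that accumulation step (hypothetical names; the directional SH data stored per voxel is omitted, leaving a single scalar):

#include <cmath>
#include <vector>

struct Voxel { float radiance = 0.0f; float weight = 0.0f; };

struct Volume {
    int n;                     // cubic grid of n*n*n voxels
    float cellSize;            // world-space voxel size
    std::vector<Voxel> voxels; // n*n*n entries

    Voxel& at(int x, int y, int z) { return voxels[(z * n + y) * n + x]; }
};

// Accumulate one surfel into the 8 voxels whose centers surround it.
// A GPU version of this needs atomic adds to global memory.
void splatSurfel(Volume& v, float px, float py, float pz, float radiance)
{
    // voxel centers sit at integer coordinates in grid space
    float gx = px / v.cellSize, gy = py / v.cellSize, gz = pz / v.cellSize;
    int x0 = (int)std::floor(gx), y0 = (int)std::floor(gy), z0 = (int)std::floor(gz);
    float fx = gx - x0, fy = gy - y0, fz = gz - z0;

    for (int dz = 0; dz < 2; ++dz)
    for (int dy = 0; dy < 2; ++dy)
    for (int dx = 0; dx < 2; ++dx) {
        int x = x0 + dx, y = y0 + dy, z = z0 + dz;
        if (x < 0 || y < 0 || z < 0 || x >= v.n || y >= v.n || z >= v.n)
            continue;
        float w = (dx ? fx : 1 - fx) * (dy ? fy : 1 - fy) * (dz ? fz : 1 - fz);
        Voxel& vox = v.at(x, y, z);
        vox.radiance += radiance * w; // weighted accumulation
        vox.weight += w;              // normalize as radiance / weight later
    }
}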

Although I got high-quality data from this (I also used SH2 or SH3 per voxel), my main problem was that even the impractical SH3 was not good enough to approximate directional occlusion at higher mips - too much leaking. I did no cone tracing, but a volumetric light diffusion algorithm like this https://www.youtube.com/watch?v=YQuv3Myc9_M and some other stuff. Today I'm reconsidering voxels and cone tracing for sharp reflections.

Some questions:

Do you also support clipmaps for larger scenes? How practical is it?

Any experience with the lit-room-beside-dark-room problem?

 

7 hours ago, Vilem Otte said:

The downside of any lightmap or radiosity solution is that you need to somehow unwrap your scene into actual lightmap - which I always found to be somewhat problematic.

Yes, I assume that's the reason why many people consider VXGI even though they work on a mainly static game.

 

7 hours ago, Vilem Otte said:

Reflective Shadow Maps (RSM) - no secondary shadows; somewhat good performance, but it doesn't look nearly as good. VXGI is superior.

IIRC, Crytek's approach is to use RSM in combination with visibility traced from a voxel volume - this solves the memory problem.

 

I agree with everything you say - it should be helpful for deciding on a GI solution.

Really nice and impressive work! :)

14 hours ago, JoeJ said:

Let us know if you ever implement (or improve?) the temporal filter.

I have implemented basic first-order temporal filtering. It is as simple as:


next_value = prev_value * alpha + curr_value * (1.0 - alpha)  // alpha = history weight: higher alpha = smoother, but more temporal lag

Here is a video showing different alpha parameters (0.5, 0.9 and 0.99) - I just went ahead, recorded it and uploaded it (I intentionally left just the bottom plane of Sponza with a collider, so boxes can intersect with the sides of the scene):

This requires an additional input parameter though - one which has a high impact on flickering reduction, but also on how outdated the information used for filtering is. This is especially visible with 0.99, as the GI just doesn't seem accurate for the given frame!

14 hours ago, JoeJ said:

Today I'm reconsidering voxels and cone tracing for sharp reflections.

See the reflecting sphere - it works to some extent, but ray-traced reflections are vastly superior (with one additional huge problem: acceleration structures for dynamic scenes - even though I have a high-performance GPU ray tracer, it's only good for static scenes). The problem with cone tracing and sharp reflections is the resolution of your voxel buffer.

Using a sparse octree helps a lot there (but the construction and the actual cone tracing are quite heavy on performance - yet, compared to a ray tracer, doable for dynamic scenes).

14 hours ago, JoeJ said:

Do you also support clipmaps for larger scenes? How practical is it?

I don't, although I've been considering adding it for a while - especially as I use Direct3D 12, where you have tiled resources (which is a possibly huge advantage).

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

6 hours ago, Vilem Otte said:

Here is a video showing different alpha parameters (0.5, 0.9 and 0.99)

Thanks! Makes it pretty acceptable to my eyes :)

I already have a tree of surfels, so most likely I'll trace this directly - voxelization would only make sense if I want some volumetric lighting as well...

On 25.4.2018 at 2:21 AM, Vilem Otte said:
  • Anisotropic voxels - not worth it. Increasing resolution is the easier way around, with better results, and it doesn't increase computational complexity.
  • Static vs. dynamic - generally, pre-computing static data and then inserting dynamic objects is worth it. It requires an additional 1-bit flag, though.

So, you don't recommend anisotropic voxels; better to increase the resolution.

A. Cone tracing (in the mipmap implementation) is actually mainly sampling the 3D mipmap.
   a. When tracing smaller cone angles, the result will be better, but I expect the tracing time to grow MORE than linearly.

   b. When tracing a higher-resolution grid / 3D mipmap, the results will also be better, but the memory increases by a factor of 8, which is a lot.

Did you test the tradeoff between both variations, e.g. 4 cones vs. 5, 6, 7 - quality improvement vs. render time - and the same for grid size?

As I understand it, the cone angle here is only a virtual/mathematical rule for how geometrically far the next sample step is from the previous one, and in which direction this step goes (centers of spheres that touch each other inside the cone).
This means that smaller angles result in more steps to trace, because the geometric step size is smaller. Is this understanding OK?
How do the timings increase in a practical scenario?
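To make my understanding concrete, here is a minimal sketch of that stepping scheme (all values hypothetical): the sphere diameter at distance t along a cone with half-angle theta is d = 2 * t * tan(theta), which gives both the mip level to sample and the next step size, so t grows geometrically and smaller angles need more steps.

#include <cmath>
#include <cstdio>

int main()
{
    const float voxelSize   = 0.1f;  // world-space size of a mip-0 voxel
    const float halfAngle   = 0.3f;  // cone half-angle in radians
    const float maxDistance = 10.0f;

    float t = voxelSize; // start one voxel out to avoid self-sampling
    int steps = 0;
    while (t < maxDistance) {
        float diameter = 2.0f * t * std::tan(halfAngle);              // sphere diameter at t
        float mip = std::log2(std::fmax(diameter / voxelSize, 1.0f)); // mip level to sample
        // ... sample the 3D mip chain at distance t with level 'mip' here ...
        (void)mip;
        t += diameter; // the next sphere touches the current one
        ++steps;
    }
    std::printf("steps: %d\n", steps); // fewer steps for wider cones
    return 0;
}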

B. Do you sample every single point in screen space, or only every n-th pixel, which decreases render time by a factor of 4?

  If you do so, how do you reconstruct the image? A naive interpolation would blur out everything, so there must be some kind of filter with surface discontinuity detection, to distinguish intentionally sharp contrasts from contrasts that should be blurred out because of undersampling.

C. On the other hand, the whole GI process could also be done at e.g. a quarter of the screen resolution and finally be upscaled in the light gathering pass and blended with the direct illumination.
Did you maybe test this?

Many thanks for your help

 

50 minutes ago, evelyn4you said:

C. On the other hand, the whole GI process could also be done at e.g. a quarter of the screen resolution and finally be upscaled in the light gathering pass and blended with the direct illumination.
Did you maybe test this?

Quote

Fourthly, I perform the cone tracing pass at half of the screen resolution. However, this results in a visible artifact when up-scaling to full resolution, especially at the edges of geometry (the strength of the indirect lighting in the following screenshots is increased to show the artifact more clearly). (source: http://simonstechblog.blogspot.be/2013/01/implementing-voxel-cone-tracing.html)
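The usual fix is exactly the discontinuity-aware reconstruction you describe under B: weight the low-resolution samples by depth similarity instead of using plain bilinear interpolation. A minimal sketch (hypothetical names; depth-only weights, though normals can be added the same way):

#include <cmath>

struct GISample { float r, g, b; };

// Upsample a half-resolution GI buffer at one full-resolution pixel.
// Low-res samples whose depth differs strongly from the full-res pixel's
// depth get a small weight, which preserves geometric edges.
GISample upsampleBilateral(const GISample* giLow, const float* depthLow,
                           int lowWidth, int lowHeight,
                           float u, float v, // full-res pixel center in [0,1]
                           float depthFull)
{
    float x = u * lowWidth - 0.5f, y = v * lowHeight - 0.5f;
    int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
    float fx = x - x0, fy = y - y0;

    GISample out = { 0, 0, 0 };
    float wsum = 0.0f;
    for (int dy = 0; dy < 2; ++dy)
    for (int dx = 0; dx < 2; ++dx) {
        int xi = x0 + dx, yi = y0 + dy;
        if (xi < 0 || yi < 0 || xi >= lowWidth || yi >= lowHeight)
            continue;
        int idx = yi * lowWidth + xi;
        float wBilinear = (dx ? fx : 1 - fx) * (dy ? fy : 1 - fy);
        float wDepth = 1.0f / (0.001f + std::fabs(depthFull - depthLow[idx])); // edge-aware term
        float w = wBilinear * wDepth;
        out.r += giLow[idx].r * w;
        out.g += giLow[idx].g * w;
        out.b += giLow[idx].b * w;
        wsum += w;
    }
    if (wsum > 0.0f) { out.r /= wsum; out.g /= wsum; out.b /= wsum; }
    return out;
}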

 

50 minutes ago, evelyn4you said:

Did you test the tradeoff between both variations, e.g. 4 cones vs. 5, 6, 7 - quality improvement vs. render time - and the same for grid size?

You want to span a hemisphere (cosine-weighted). Crassin himself used 5 cones for the diffuse contribution and achieved good results. For specular and perfectly reflective contributions, you'll need extra cones as well.
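As a rough illustration (the directions and weights below are my own assumptions, not Crassin's published values), such a cosine-weighted hemisphere layout could look like this:

#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
static Vec3 normalize(Vec3 a)
{
    float l = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return mul(a, 1.0f / l);
}

// Placeholder: in a real renderer this marches the voxel mip chain.
static Vec3 traceCone(Vec3 /*origin*/, Vec3 /*dir*/, float /*halfAngle*/) { return { 0, 0, 0 }; }

// One wide cone along the normal plus four tilted into the tangent plane,
// side cones weighted by (an approximation of) the cosine to the normal.
Vec3 diffuseGI(Vec3 p, Vec3 n, Vec3 tangent, Vec3 bitangent)
{
    const float halfAngle = 0.5f; // wide cones for the diffuse term
    const float tilt = 0.7f;      // how far side cones lean away from the normal

    Vec3 sum = traceCone(p, n, halfAngle); // center cone, full weight
    const Vec3 sideDirs[4] = {
        normalize(add(mul(n, 1.0f - tilt), mul(tangent, tilt))),
        normalize(add(mul(n, 1.0f - tilt), mul(tangent, -tilt))),
        normalize(add(mul(n, 1.0f - tilt), mul(bitangent, tilt))),
        normalize(add(mul(n, 1.0f - tilt), mul(bitangent, -tilt))),
    };
    for (const Vec3& d : sideDirs)
        sum = add(sum, mul(traceCone(p, d, halfAngle), 0.7f)); // ~cosine weight
    return mul(sum, 1.0f / (1.0f + 4.0f * 0.7f)); // normalize total weight
}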

🧙

This topic is closed to new replies.
