Vilem Otte said:
You don't need to rebuild acceleration structures at any point, you don't need to manually build it for objects upon adding (internally the api takes care of it), you don't need to explicitly refresh them per object (internally the api takes care of it). This is almost all the time hidden from the user.
Basically that's true for DXR as well. Maybe yours provides more abstraction by even doing a rebuild automatically, but it's exactly such abstractions that prevent me from using it.
And I did not even see that coming, initially, after the RTX announcement and looking up the first release of DXR. I criticized RTX from the start, because I felt it would prevent us from doing further research to come up with efficient raytracing. Only NVidia could do this from now on, since they build the fixed-function units and nobody else has access. Thus nobody would have a reason to work on raytracing anymore, and progress would stagnate. (You do it anyway, which is crazy, but you earn my respect ;D )
Only a bit later I realized I was actually right. But the problem sits way deeper than being blocked from solving the random memory access problem of a traceRay function…
Years before, people often said: ‘Raytracing is not realtime, because building the acceleration structure is too expensive for complex scenes. You can't do that each frame if the scene is dynamic.’
Then I replied: ‘Bullshit, dude! This can be solved easily: you prebuild a BVH for each of your models offline. At runtime you do this per frame: refit the BVH for characters, then build a small BVH over all your model instances. There won't be many, so building that top level is little work and totally realtime.’
And this is exactly how DXR deals with the problem as well: bottom-level structures per model, a small top-level structure over the instances. Nothing wrong with that? Actually nice?
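In DXR terms that scheme looks roughly like this. This is only a minimal sketch, assuming the BLASes were already built once with ALLOW_UPDATE and that the command list, scratch memory and instance buffer exist somewhere; names like RefitCharacterBlas are placeholders of mine, not anyone's engine code:

```cpp
#include <d3d12.h>

// Refit one character BLAS in place: vertex positions may change, topology may not.
void RefitCharacterBlas(ID3D12GraphicsCommandList4* cmdList,
                        const D3D12_RAYTRACING_GEOMETRY_DESC& geom,
                        ID3D12Resource* blas, ID3D12Resource* scratch)
{
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC desc = {};
    desc.Inputs.Type  = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    desc.Inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE |
                        D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE;
    desc.Inputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
    desc.Inputs.NumDescs = 1;
    desc.Inputs.pGeometryDescs = &geom;
    desc.SourceAccelerationStructureData  = blas->GetGPUVirtualAddress(); // refit in place
    desc.DestAccelerationStructureData    = blas->GetGPUVirtualAddress();
    desc.ScratchAccelerationStructureData = scratch->GetGPUVirtualAddress();
    cmdList->BuildRaytracingAccelerationStructure(&desc, 0, nullptr);
}

// Rebuild the small top level over all model instances each frame: cheap, since
// there are only as many leaves as instances. (A UAV barrier on the BLAS buffers
// is needed before this in real code.)
void BuildTopLevel(ID3D12GraphicsCommandList4* cmdList,
                   ID3D12Resource* instanceDescs, UINT instanceCount,
                   ID3D12Resource* tlas, ID3D12Resource* scratch)
{
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC desc = {};
    desc.Inputs.Type  = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL;
    desc.Inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    desc.Inputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
    desc.Inputs.NumDescs = instanceCount;
    desc.Inputs.InstanceDescs = instanceDescs->GetGPUVirtualAddress();
    desc.DestAccelerationStructureData    = tlas->GetGPUVirtualAddress();
    desc.ScratchAccelerationStructureData = scratch->GetGPUVirtualAddress();
    cmdList->BuildRaytracingAccelerationStructure(&desc, 0, nullptr);
}
```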
No. It's not enough. We forgot to think about LOD, which besides visibility is one of the two key problems in computer graphics.
And we had good reasons to ignore LOD, because we have a rich history of being incompetent at solving it. Actually, that's caused by GPUs. Shiny had their Nanite already in 2000, and we had dynamic LOD algorithms for terrain as well. But it was not efficient to upload geometry to the GPU each frame. It was faster to be silly and lazy: use more triangles than needed, but upload them only once and let the GPU brute-force the problem away.
So we stopped working on LOD at all. We accepted the non-solution of discrete LODs as good enough. And decades later, Epic proved us all fools by shockingly showing off what we had missed, without us even noticing.
At this point we should realize that we took the bait: GPUs lured us in the wrong direction.
Graphics programmers were once known to be optimization experts, creative and innovative, never hesitating to work on hard open problems, like hidden surface removal.
But then, after GPUs? No more creation of portals while rendering front to back as seen in Quake, no more bump mapping as seen in Outcast, no more progressive LOD as seen in Messiah.
We lost the ability to tackle open problems and became close-to-the-metal low-level optimizers. To keep up with progress, all we did was read the latest NVidia papers, which taught us how to do brute force most efficiently.
We took the bait, and we were grateful in accepting that the only way forward is fixed function and ever-increasing teraflop numbers, helping them sell bigger, bigger, and even bigger GPUs.
And gamers took the bait too. They believed in the graphics gods at NV just as much as we did. And to afford bigger and bigger GPUs, they became miners.
What a sad story, no?
It's the fucking truth.
But now back to my problem: proving why awesome DXR prevents us from solving the LOD problem.
Let's take Nanite as an example.
Nanite has a BVH for its meshes, because LOD is obviously a problem of hierarchical detail. Refining detail means descending a branch of the tree; reducing detail means stopping at an internal node. Each node stores a patch of geometry at its certain level of detail.
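To make the "descend or stop" idea concrete: you pick a cut through the hierarchy based on projected error. This is not Epic's code or data layout, just a minimal sketch under my own made-up names (Node, ScreenError, SelectLod), with one distance used for the whole traversal for simplicity:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical node of a LOD hierarchy: each node owns a geometry patch
// that represents its whole subtree at a coarser level of detail.
struct Node {
    float    geometricError; // error in world units if we render this node's patch
    uint32_t firstChild;     // index into the node array, 0 if leaf
    uint32_t childCount;
    uint32_t patchId;        // which geometry patch to draw/trace for this node
};

// Rough screen-space error: projected size of the node's geometric error.
static float ScreenError(const Node& n, float distance, float projScale)
{
    return n.geometricError * projScale / distance; // assumes distance > 0
}

// Select the cut: stop where the patch is already detailed enough for the screen,
// otherwise descend the branch and refine.
void SelectLod(const std::vector<Node>& nodes, uint32_t nodeIndex,
               float distance, float projScale, float errorThresholdPixels,
               std::vector<uint32_t>& outPatches)
{
    const Node& n = nodes[nodeIndex];
    if (n.firstChild == 0 || ScreenError(n, distance, projScale) < errorThresholdPixels) {
        outPatches.push_back(n.patchId);    // reducing detail: stop at this node
        return;
    }
    for (uint32_t i = 0; i < n.childCount; ++i)   // refining detail: descend
        SelectLod(nodes, n.firstChild + i, distance, projScale,
                  errorThresholdPixels, outPatches);
}
```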
Do they calculate this hierarchy when loading the highest-resolution model? As a nicely abstracted background process? Taking just a minute of processing time? Just like DXR does for its BVH?
No. Of course they precompute this, and load from disk only what they actually need, without any background processing. It's not that they are totally stupid.
DXR isn't stupid either, one may think. It needs the abstraction so every GPU vendor can use its own custom BVH format suited to its hardware. I see that.
And the price to pay simply is: on PC we build our BVH on the GPU at runtime, during the game, each time we stream in some new stuff. This is stupid. But let them just pay a premium for big GPUs. All those cores need some work. The PC master race can afford to spend some cycles; they have plenty of them. No matter how much redundant work they do, they will still have one FPS more than a PS5, which really is all that matters to them.
So far so good. What's my problem then?
The problem is: once your mesh exchanges a small patch of its geometry for a less or more detailed version, its topology changes.
This breaks the DXR BVH. It has to be rebuilt from scratch. You cannot refit it: a refit only tolerates moving vertices, not changing triangle counts or indices, so the gradual change that any progressive or continuous LOD solution requires forces a complete rebuild.
Notice: as we move through the scene, all your models will change some sections of their mesh to fit detail to screen. That's the idea of a proper LOD solution. Any LOD solution, not just Epic's.
Result: your whole scene changes. You need to rebuild the BVH from scratch for your entire world. A complete rebuild is the only option DXR provides for this case.
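In DXR flag terms the dilemma looks like this. A hedged sketch only; the clusterSwappedThisFrame bookkeeping is something I made up, but the constraint it checks is DXR's:

```cpp
#include <d3d12.h>

// Per-frame decision for one BLAS: a refit (PERFORM_UPDATE) only covers moving
// vertices. As soon as a LOD swap changed the index data or the triangle count,
// the only legal option DXR leaves is a full rebuild from scratch.
D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAGS
ChooseBlasBuildFlags(bool clusterSwappedThisFrame /* topology changed by LOD */)
{
    if (clusterSwappedThisFrame)
    {
        // Full rebuild; with continuous LOD this ends up being nearly every
        // BLAS in view, nearly every frame.
        return D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE |
               D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_BUILD;
    }
    // Only vertex positions changed (skinning, morphs): a cheap refit is allowed.
    return D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_UPDATE |
           D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PERFORM_UPDATE;
}
```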
Can your awesome RTX 3090 do this? No. Not even 10 of them in a mining rig could do this in time.
Can you shove your shiny RTX 3090 up your ass? Yes. And that's exactly what you should do with it. I would. I never requested Tensor Cores. No game dev did. We can do temporal upscaling without ML; UE5 is again a good example to prove this. And RTX I can not even use, because the self-appointed experts at NV and MS headquarters were too busy getting high on their own farts from porting OptiX to DirectX, and they forgot about LOD out of incompetence or ignorance.
To allow us to solve this problem, we need one of two options:
1. GPU vendors expose their BVH data structures through vendor extensions, so we can build and modify them ourselves. We are not too stupid to do this, even after decades of taking the bait.
2. Make a BVH API that exposes it through abstractions all GPU vendors can agree on, something like the sketch below. That's difficult, maybe.
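Purely hypothetical, just to illustrate what option 2 could mean: an exposed node layout plus a way to patch a subtree in place when a cluster swaps its LOD. Nothing like this exists in DXR or Vulkan today; every name here is invented.

```cpp
#include <cstdint>

// Entirely hypothetical: what an exposed, vendor-agreed BVH node could look like.
struct ExposedBvhNode {
    float    aabbMin[3];
    float    aabbMax[3];
    uint32_t leftChild;       // node index, or a leaf flag in the high bit
    uint32_t rightChild;
    uint32_t primitiveOffset; // leaves: first triangle / cluster
    uint32_t primitiveCount;
};

// Hypothetical entry point: replace the subtree that covered an old geometry
// patch with a prebuilt subtree for the new LOD, instead of rebuilding the
// whole acceleration structure from scratch.
// bool PatchBvhSubtree(ExposedBvhNode* nodes, uint32_t subtreeRoot,
//                      const ExposedBvhNode* newSubtree, uint32_t newNodeCount);
```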
Neither of these options will happen anytime soon. It will take years, more likely a decade.
So I ask you: does DXR spur progress, or does it dictate stagnation?
It's the latter. The most efficient way to do raytracing over my geometry will be a compute tracer: slow tracing, but the cost of BVH building is zero.
But I will not trace triangles, just surfels. So again, no sharp reflections or hard shadows. As my GI already provides environment maps to look up specular, I'm not sure if it's worth it at all.
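For what "tracing surfels" means in practice: a surfel is just an oriented disc, and the intersection test is a plane hit plus a radius check. A small CPU-style sketch of that test (the compute shader version is the same math); Vec3, Surfel and IntersectSurfel are names of mine:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Surfel { Vec3 center; Vec3 normal; float radius; };

// Ray vs. oriented disc: intersect the surfel's plane, then check the hit point
// lies within the surfel radius. Returns the hit distance t, or -1 on miss.
float IntersectSurfel(Vec3 rayOrigin, Vec3 rayDir, const Surfel& s,
                      float tMin, float tMax)
{
    float denom = dot(rayDir, s.normal);
    if (std::fabs(denom) < 1e-6f) return -1.0f;            // ray parallel to disc
    float t = dot(sub(s.center, rayOrigin), s.normal) / denom;
    if (t < tMin || t > tMax) return -1.0f;                // outside ray segment
    Vec3 hit = { rayOrigin.x + t * rayDir.x,
                 rayOrigin.y + t * rayDir.y,
                 rayOrigin.z + t * rayDir.z };
    Vec3 d = sub(hit, s.center);
    if (dot(d, d) > s.radius * s.radius) return -1.0f;     // outside disc radius
    return t;
}
```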
It's crazy. But I think LOD is more important than raytracing, and they force me to choose between the two.
Some people will now ask: ‘Hey, but UE5 does use RTX! They even migrated their whole Lumen stuff to HW RT, so why do you say they can't use it with Nanite?’
The answer is that they make pretty much the same compromise: they use a static low-poly version of their models for tracing, without LOD. But they can not generally trace the awesome detail they can rasterize. And they also need to build the RT BVH on the GPU for those low-poly models, although they could otherwise just convert their own format. Brian Karis has also criticized DXR for its shortcomings.
Now I hope some more companies will do their own version of Nanite, so people finally understand the broken state of HW RT and request solutions.