
DirectX 12 adds a Ray Tracing API

Started March 19, 2018 11:22 PM

Finally the ray tracing geekiness starts:

https://blogs.msdn.microsoft.com/directx/2018/03/19/announcing-microsoft-directx-raytracing/

 

Let's collect some interesting articles. I'll start with:

https://www.remedygames.com/experiments-with-directx-raytracing-in-remedys-northlight-engine/

Honestly I'd like to try it - what I'm skeptical about are dynamic scenes (in my thesis I was doing interactive bidirectional path tracing in CUDA, and it worked well... for static scenes; re-building acceleration structures was simply too big a performance hit).

To also add something (and not just spam around) - http://forums.directxtech.com/index.php?topic=5860.0

 

Has anyone successfully run it?

 

I only have an RX 480 (and an older Radeon 290) available at the moment - which is a no-go, since they only support Shader Model 5.0... this might be a hint to get up to date with some hardware.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com


Welp, here's about all you need to know: https://www.remedygames.com/experiments-with-directx-raytracing-in-remedys-northlight-engine/

"Single Ray Per Pixel, 5ms @1080p on very high end graphics card, single sample termination, ambient occlusion with geometry sampling only" Which of course is really noisy so then you get to add denoising overhead on top of that. Oh and it's all static too.

So, yeah - performance is definitely not realtime for today, and probably not for tomorrow or next gen either. Really don't understand why DirectX needs its own raytracing API in the first place.

2 hours ago, FreneticPonE said:

Really don't understand why DirectX needs its own raytracing API in the first place.

Download the SDK and check out the docs to see the problems that it's solving: http://forums.directxtech.com/index.php?topic=5860.0

Basically, an RT API is a great stress-test for modern compute-based GPUs. You want things like:

  • Dispatch calls that can dynamically produce more dispatch calls, recursively
  • Dispatches that can pre-empt themselves with calls to a different shader, before resuming.
  • Being able to bind an array of compute shaders to a dispatch, and have them refer to each other via pointers.
  • Being able to bind many root-parameter sets to a dispatch.
  • A very smart scheduler that can queue and coalesce these different shader invocations into large thread groups before executing them.

Current GPUs are almost there, but even with DX12/Vulkan this kind of thing isn't possible yet. The API described in the SDK preview is trying to solve these problems - there's a rough sketch below of what a dispatch looks like in that API.
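For illustration, a minimal host-side sketch of kicking off a raytracing dispatch, assuming the acceleration structure, raytracing pipeline and shader tables have already been built. Names follow the DXR docs (D3D12_DISPATCH_RAYS_DESC, DispatchRays); the experimental SDK's fallback layer may spell some of them slightly differently.

    // Rough sketch: launching a DXR dispatch once everything is built.
    // The shader tables are what make points 3 and 4 in the list above possible:
    // each record pairs a shader identifier with its own root arguments,
    // and TraceRay() in HLSL indexes into them at runtime.
    #include <d3d12.h>

    void DispatchRaysSketch(ID3D12GraphicsCommandList4* cmdList,
                            ID3D12StateObject*          rtPipeline,   // bundles ALL the shaders
                            D3D12_GPU_VIRTUAL_ADDRESS   rayGenRecord, UINT64 rayGenSize,
                            D3D12_GPU_VIRTUAL_ADDRESS   missTable,    UINT64 missSize,  UINT64 missStride,
                            D3D12_GPU_VIRTUAL_ADDRESS   hitTable,     UINT64 hitSize,   UINT64 hitStride,
                            UINT width, UINT height)
    {
        D3D12_DISPATCH_RAYS_DESC desc = {};
        desc.RayGenerationShaderRecord.StartAddress = rayGenRecord;
        desc.RayGenerationShaderRecord.SizeInBytes  = rayGenSize;
        desc.MissShaderTable.StartAddress  = missTable;
        desc.MissShaderTable.SizeInBytes   = missSize;
        desc.MissShaderTable.StrideInBytes = missStride;
        desc.HitGroupTable.StartAddress    = hitTable;
        desc.HitGroupTable.SizeInBytes     = hitSize;
        desc.HitGroupTable.StrideInBytes   = hitStride;
        desc.Width  = width;   // typically one ray-gen invocation per pixel
        desc.Height = height;
        desc.Depth  = 1;

        cmdList->SetPipelineState1(rtPipeline);
        cmdList->DispatchRays(&desc);
    }

The interesting part is what happens behind DispatchRays: the driver/hardware gets to queue and coalesce all the hit/miss shader invocations spawned by TraceRay into full thread groups, which is exactly the last bullet above.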

They're also promising to put a full compute-shader-based fallback implementation up on GitHub, which is amazing! ...and then any apps written against this common API can be accelerated further as the HW vendors move more and more of it over to their next generation of front-ends / thread schedulers.

NV already has hardware that can do this stuff, so Intel and AMD are under pressure to catch up with more flexible shader thread scheduling. Also, NV doesn't like being locked to MS, so you can be certain they'll release GL and VK extensions soon enough... at which point Khronos will feel the pressure to standardize these features too.

Maybe in the long term, having standard RT APIs is a silly idea... but in the short term, this is going to have a very positive effect!

Thanks, both of you!

 

@Hodgman I've been around the ray tracing business (especially the GPU side) for some time - a few years while working on my BSc. and MSc., and since then as a hobbyist. Which is why I'm so interested.

Sadly, I doubt I'll be able to run it anywhere (I'm running only AMD GPUs as of now - Shader Model 5.0 only - and buying a new card is not really an option, as everything that could be used is literally sold out around here... damn those miners!). But maybe in a few weeks, when supplies arrive?!

I'm very keen to see an RT API though (which is why I'm so interested - it really won't let me sleep tonight! ... 6 AM and still digging through that code!). We've had some approaches before, but mostly you either had to write whole traversal + shading kernels, which were huge, or store hit results in buffers and shade afterwards.

From what's in the presentation linked by @FreneticPonE, the results don't really impress me (though it's hard to compare with anything when all I see is a few slides).

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

45 minutes ago, Vilem Otte said:

buying a new card is not really an option, as everything that could be used is literally sold out around here... damn those miners! But maybe in a few weeks, when supplies arrive?!

Keep in mind that the only GPU at this time which can run all this natively is a $3k MSRP Titan V. Might want to hold off for a bit.

3 hours ago, FreneticPonE said:

Which of course is really noisy so then you get to add denoising overhead on top of that

The denoising is pretty crazy though. I'm assuming RTX sinks at least some support for Morgan's technique into the hardware layer.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

9 hours ago, Vilem Otte said:

Honestly I'd like to try it - what I'm skeptical about are dynamic scenes (in my thesis I was doing interactive bidirectional path tracing in CUDA, and it worked well... for static scenes; re-building acceleration structures was simply too big a performance hit).

Did you try the idea of using static trees per model, just transforming them (refitting for skinned meshes), and rebuilding only a smaller top-level tree to link it all together?

I do this on the GPU and the performance cost is negligible, but I have lower quality requirements - there's a rough sketch of the idea below.

Recently I read a paper about Brigade, and they do the same using a scene graph asynchronously on the CPU.
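In case it helps, here's a rough sketch of that two-level scheme. Everything in it (Bvh, Instance, RefitBounds, BuildTopLevel) is a hypothetical placeholder rather than any real API - the point is just the split between cheap per-model refits and a full rebuild of only the tiny top-level tree:

    // Conceptual sketch of a two-level acceleration structure update.
    #include <vector>

    struct Bvh { /* per-model bottom-level tree, built once up front */ };

    struct Instance {
        Bvh*  blas;          // static tree for this model
        float transform[12]; // world matrix, updated per frame for rigid motion
        bool  deforming;     // skinned meshes need a refit, rigid ones don't
    };

    struct TopLevelBvh { /* tiny tree with one leaf per instance */ };

    void RefitBounds(Bvh& /*blas*/) {
        // A real implementation walks the tree bottom-up and re-expands node
        // bounds around the deformed vertices - no topology change, so it's cheap.
    }

    TopLevelBvh BuildTopLevel(const std::vector<Instance>& /*instances*/) {
        // A real implementation builds a fresh tree over the instances'
        // world-space bounds. With tens or hundreds of leaves it costs almost nothing.
        return {};
    }

    TopLevelBvh UpdateScene(std::vector<Instance>& instances) {
        for (Instance& inst : instances)
            if (inst.deforming)
                RefitBounds(*inst.blas); // deformed geometry: refit, don't rebuild

        // Rigid instances only carry a new transform - no per-triangle work at all.
        return BuildTopLevel(instances); // rebuilt every frame, but it's tiny
    }

(DXR's acceleration structures are shaped the same way: bottom-level structures per mesh, plus a top-level structure built over instance transforms.)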

4 hours ago, Hodgman said:
  • Dispatch calls that can dynamically produce more dispatch calls, recursively
  • Dispatches that can pre-empt themselves with calls to a different shader, before resuming.
  • Being able to bind an array of compute shaders to a dispatch, and have them refer to each other via pointers.
  • Being able to bind many root-parameter sets to a dispatch.
  • A very smart scheduler that can queue and coalesce these different shader invocations into large thread groups before executing them.

Yummy! I hope this becomes available for general compute APIs as well, will take a look... :)

 

EDIT: Do you know if one can keep the data in LDS somehow while 'switching' shaders?

(There is no description, just the SDK, and I don't want to install it just to read right now...)

 

 

The video doesn't seem super impressive for the effort. The need to denoise even just specular reflections is somewhat disappointing - looking at alternatives like voxel cone tracing, I still feel the need to work on better methods / algorithms / data structures. Of course with faster hardware this will work, even for path tracing, but I still see path tracing as a simple but slow solution.

Edit: It's more impressive after reading the PowerPoint. They do a lot more than just reflections - nothing else is baked :)

 

I think it's a very promising step in a good direction. There are so many rendering techniques that rely on the ray-tracing idea (SSR, GI, SSAO, SSVGI, screen-space everything), and a low-level API will help achieve good performance.

I wonder if it could be used to shoot some rays before scene rendering to replace the Hi-Z prepass for geometry culling. Do you think it could quickly calculate scene object visibility?

Finally, of course, NV has always been one step ahead in terms of new rendering features :D They like to put some pressure on the others.

Flax Game Engine - www.flaxengine.com

5 minutes ago, mafiesto4 said:

I wonder if it could be used to shoot some rays before scene rendering to replace the Hi-Z prepass for geometry culling. Do you think it could quickly calculate scene object visibility?

I don't think so. One major problem with rays is that they have no area, so you always need too many of them to approximate something. For robust object visibility you need one ray per pixel. Rasterization is likely to be faster as long as your scene can be approximated well with polygons. (But object visibility means just visibility from one point, while GI means solving visibility from each point to every other point, so this addresses a much larger problem.)
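Just to make the cost concrete, a toy sketch of ray-based object visibility - the two callbacks are hypothetical placeholders for whatever the tracing backend provides:

    #include <cstdint>
    #include <functional>
    #include <vector>

    struct Ray { float origin[3]; float dir[3]; };

    // traceNearestHit returns the id of the closest object hit, or -1 on a miss.
    std::vector<bool> ComputeVisibility(
        uint32_t width, uint32_t height, size_t objectCount,
        const std::function<Ray(uint32_t, uint32_t)>& makeCameraRay,
        const std::function<int32_t(const Ray&)>& traceNearestHit)
    {
        std::vector<bool> visible(objectCount, false);

        // 1920x1080 already means ~2 million rays just to mark what's on screen,
        // which is why rasterization (or a Hi-Z prepass) usually wins for this.
        for (uint32_t y = 0; y < height; ++y)
            for (uint32_t x = 0; x < width; ++x) {
                int32_t id = traceNearestHit(makeCameraRay(x, y));
                if (id >= 0)
                    visible[static_cast<size_t>(id)] = true;
            }
        return visible;
    }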

This is major news and I can't wait to try it out. Hope we get to see those changes reflected in SharpDX not too long after.

