fleabay said:
My use of ‘low-level’ was meant as directly using the APIs, not indirectly through a framework or engine. Your quote here makes it sound to me like one could only use RT on DX11/GL if directly using those APIs. Does your meaning of low-level differ from mine?
I'm totally confused now as well : )
What I meant is simply: hardware ray tracing is exposed only through DX12 (as DXR) and VK (as an extension), the low-level APIs, and not at all through DX11 or OpenGL (high level).
It seems neither Microsoft nor NVIDIA has any interest in updating the high-level APIs.
And this makes sense: now that the legacy game engines have all made the transition to low level, the only people who keep using high-level APIs are hobby programmers and newbies who use them for learning.
Also, game engines increase in complexity but decrease in number. The big engines that make the real money are maintained by a small number of experts, and they take on the burden of working with low-level APIs.
But that's just my personal opinion from the outside. I do not work in the game industry.
fleabay said:
Are you seeing this with RT specifically?
RT is part of my project, but that's not the point. My project is realtime GI and it's all compute. Neither RTX nor rasterization hardware is used. My experience is also mostly restricted to compute, so I can't say much about rendering.
So, moving from OpenCL (which was faster than OpenGL compute shaders) to Vulkan gave me a 2x speedup. That's much more than typical games get from the transition to a low-level API.
The reason is that with VK I can avoid slow CPU-GPU communication by prerecording the whole program flow just once and uploading it to the GPU; then, per frame, I only need one CPU command to execute the program.
With OpenCL I had to download a small result to the CPU just to set up the size of the next workload, and I had to do this for every single shader. OpenGL allows indirect dispatch, so the download is not necessary, but I still need to dispatch each shader with one CPU command. (That's much better, but in practice OpenCL was still faster, surprisingly most of all on NV GPUs.)
So this is what gave me that big speedup on the GPU side: there is no waiting to get work from the CPU.
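To make the difference concrete, here is a toy Python model (not real API code) that just counts the CPU-GPU interactions per frame for the three approaches described above. The names FrameCost, opencl_style, opengl_indirect_style and vulkan_prerecorded_style are made up for illustration, and so is the simplification of exactly one readback per shader; real drivers batch and overlap work in ways this ignores, and it says nothing about actual timings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameCost:
    """Hypothetical per-frame bookkeeping; not part of any real API."""
    cpu_commands: int  # commands the CPU must issue to the GPU per frame
    readbacks: int     # GPU -> CPU downloads the CPU must wait for per frame

    def sync_points(self) -> int:
        # Each command or readback is a point where the GPU may sit idle
        # waiting for the CPU (assumption of this toy model).
        return self.cpu_commands + self.readbacks


def opencl_style(num_shaders: int) -> FrameCost:
    # One enqueue per shader, plus one small readback per shader
    # to learn the size of the next workload.
    return FrameCost(cpu_commands=num_shaders, readbacks=num_shaders)


def opengl_indirect_style(num_shaders: int) -> FrameCost:
    # Indirect dispatch keeps workload sizes on the GPU, so no readbacks,
    # but each shader still needs its own dispatch call from the CPU.
    return FrameCost(cpu_commands=num_shaders, readbacks=0)


def vulkan_prerecorded_style(num_shaders: int) -> FrameCost:
    # The whole flow is recorded once up front; per frame the CPU only
    # submits that one prerecorded command buffer, whatever its length.
    return FrameCost(cpu_commands=1, readbacks=0)


if __name__ == "__main__":
    shaders = 50  # made-up shader count, for illustration only
    for model in (opencl_style, opengl_indirect_style, vulkan_prerecorded_style):
        cost = model(shaders)
        print(f"{model.__name__}: {cost.sync_points()} sync points per frame")
```

The point of the sketch is only that the first two counts grow with the number of shaders while the prerecorded one stays constant.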
For typical game workloads it's mostly multithreaded command buffer generation and async compute that give the benefit, I guess. The former helps the CPU, the latter the GPU.
I have to add that performance is the only reason to move to low level, IMO. If you have no performance problems, I would not be surprised if you just end up pissed off. VK needs 10 times more code than my OpenCL version. It's tedious having to take care of every tiny little thing manually. And rendering is much more complicated than simple compute. : /