Path tracing benchmark
![incoherentresults.png](https://picload.org/image/roaacgac/incoherentresults.png)
Quote from the other forum: "to make it short: using coherent ray traversal, knl can render most of the scenes i have around stable below 1 ms into a 1024x1024 frambuffer."
This means only primary rays, yes? We know both raytracing and rasterization are fast enough for that, so how is the runtime for 64 secondary diffuse rays?
And some screenshots (or even a video) would be very nice and interesting; pictures are better than numbers :)
I agree that a few pictures would be nice. To that end, I think this is his website: http://rapt.technology/
...so how is the runtime for 64 secondary diffuse rays?
And some screenshots (or even a video) would be very nice and interesting; pictures are better than numbers :)
look at the chart above. it is the total runtime for camera rays + 64 random diffuse rays, more than is normally used in realtime pt.
video, paper etc. are on the way...
Thanks, the numbers are very impressive. I'm surprised.
FYI, I work on realtime GI, and counting my rays (never did that before) I get >300 MRays/s too with a Fury X, but I trace against a tree of discs, not triangles, and my scene is simpler than Sponza.
Also my algorithm is quite different, so we can't compare the two directly, but I'm sure you can squeeze more out of GPUs than your graphs show (the vendor may matter).
Too bad Larrabee did not make it to a consumer product; probably Intel will hold this back until the time is right.
Personally, I'd already agree to removing the hardware graphics pipeline from GPUs in exchange for more compute power.
Keep us up to date :)
Well, let's do some math. Coherent rays don't matter for real-life applications, so I'm skipping the 2.5 billion rays/s benchmark. Still, 1 primary ray + 64 secondary rays at 2.5 billion rays/s leads to ~18 fps @ 1080p with primary and secondary sampling alone. No texture sampling, no shader graph execution.
IMHO, only the San Miguel scene matters for incoherent rays; everything else is too simple compared to an average game. That's 100 MRays/s tops, and at 60 fps @ 1080p that's 0.8 rays/pixel. Even with temporal anti-aliasing, that's not enough for anything. Besides this, shader execution not only breaks any hope of batching, but is also going to be more costly than the raytracing itself. In raytracers, tracing rays is usually the cheaper operation compared to executing a shader graph. So let's say that halves your ray budget, going down to 0.4 rays/pixel. You also want to do game logic, rebuild the BVH for animation, physics etc... Another halving. So we are down to 0.2 rays/pixel on a $5k GPU.
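As a sanity check on the arithmetic above, here is a short script (the rays-per-second figures are taken from the post; the helper names are mine):

```python
# Back-of-the-envelope ray budget check for the numbers in the post above.
PIXELS_1080P = 1920 * 1080          # 2,073,600 pixels

def fps(rays_per_sec, rays_per_pixel, pixels=PIXELS_1080P):
    """Frames per second a tracer can sustain at a given per-pixel ray count."""
    return rays_per_sec / (rays_per_pixel * pixels)

def rays_per_pixel(rays_per_sec, target_fps, pixels=PIXELS_1080P):
    """Rays per pixel available at a fixed target frame rate."""
    return rays_per_sec / (target_fps * pixels)

# 1 primary + 64 secondary rays at 2.5 GRays/s -> ~18.5 fps
print(fps(2.5e9, 65))

# 100 MRays/s (San Miguel, incoherent) at 60 fps -> ~0.8 rays/pixel
print(rays_per_pixel(100e6, 60))
```

Both results match the figures quoted in the post (~18 fps and ~0.8 rays/pixel).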
shaken, not stirred
Disagree. Those are all valid numbers, but if we change some things it's possible with current hardware:
Raytracing at half resolution is enough. With 300 MRays/s he needs 10 ms for this.
If we do object-space lighting and pre-transform vertices (which we'll likely do anyway), the pixel and vertex shaders become close to pass-through, and we can do the same for ray hits:
If we have some LOD of object-space lighting for the whole scene and accept diffuse-only at the hit point of a reflection ray, that's a single texture fetch and no further material-dependent execution.
(That's exactly what I'm doing, and it's possible, though not yet certain, at 60 fps on consoles. My results are low frequency but have infinite bounces; they have some lag but are temporally stable.
I'll also add 4x4 environment maps to remove the diffuse-only limitation. With this there is a fallback for everything, so rays can be shortened, getting closer to coherent-ray performance.)
That's personally biased, debatable, and skips a lot of details, but I think we would get down to a maximum of 4 ms for raytracing (+ another 4 ms for object-space lighting). Still time to upscale and post-process.
Now if we do temporal anti-aliasing and use 16 rays instead of 64, that's 1 ms for tracing, so we don't necessarily need a Phi.
Rebuilding the BVH is not worth mentioning; it's a preprocess. For animation we can easily refit bounding volumes and rebuild just the top levels of the tree.
Game logic and physics will stay on the CPU as usual and have no effect on rendering.
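The refit idea mentioned above can be sketched like this (a hypothetical illustration, not anyone's actual implementation; `Node` and `refit` are made-up names): keep the tree topology fixed and recompute the axis-aligned bounding boxes bottom-up once the leaf boxes follow the animated mesh.

```python
# Minimal BVH refit sketch: after animation moves the vertices, update only
# the AABBs, leaving the tree structure (and its memory layout) untouched.
from dataclasses import dataclass

@dataclass
class Node:
    bbox_min: list          # [x, y, z] lower corner of the AABB
    bbox_max: list          # [x, y, z] upper corner of the AABB
    left: "Node | None" = None
    right: "Node | None" = None

def refit(node: Node) -> None:
    """Recompute inner-node AABBs bottom-up; leaves are assumed current."""
    if node.left is None and node.right is None:
        return  # leaf: its box was already updated from the animated mesh
    children = []
    for child in (node.left, node.right):
        if child is not None:
            refit(child)
            children.append(child)
    node.bbox_min = [min(c.bbox_min[i] for c in children) for i in range(3)]
    node.bbox_max = [max(c.bbox_max[i] for c in children) for i in range(3)]
```

A refit pass like this is linear in the node count and far cheaper than a full rebuild, at the cost of gradually degrading tree quality, which is why the post suggests also rebuilding just the top levels.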
If no triangle transforming is required, optimized and pre-generated data structures can be used.
Furthermore, you can tighten the opening angle from the full hemisphere to something smaller. Both will give
you quite another boost on top. Using a ray-budget distribution of 4x16 will increase the quality at the same time.
The benchmark above is just a worst-case brute-force approach to measure incoherent ray transport performance
on different hw architectures. Using 4 of the Intel KNLs should make it possible to display some scenes in photo-realistic
quality, which was not possible before.
mp
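Tightening the opening angle as described above is commonly done by sampling directions uniformly within a cone instead of the full hemisphere. A minimal sketch, assuming a local frame with the surface normal along +Z (the function name is mine):

```python
import math
import random

def sample_cone(cos_theta_max: float) -> tuple:
    """Uniformly sample a direction inside a cone around +Z.

    cos_theta_max = 0.0 covers the full hemisphere; values closer to 1.0
    tighten the cone (and improve ray coherence, as the post suggests)."""
    u1, u2 = random.random(), random.random()
    # Uniform in solid angle: cos(theta) is linear in u1 over the cone cap.
    cos_theta = 1.0 - u1 * (1.0 - cos_theta_max)
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * u2
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)
```

Every returned direction is a unit vector whose z component stays above `cos_theta_max`, so narrowing the cone keeps neighboring rays pointing in similar directions.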
just a quick update: we did some tests on intel's knights mill. to make it short: the machine is boring.
no perf. progress at all. for graphics, intel seems to be a dead end. gpus will dominate the next years.
so all the cpu stuff was wasted time.
mp
10 hours ago, mpeterson said:
just a quick update: we did some tests on intel's knights mill. to make it short: the machine is boring.
no perf. progress at all. for graphics intel seems to be a dead end. gpus will dominate the next years.
so all the cpu stuff was wasted time.
mp
Just so you know, Intel is creating their own GPU division now (they hired Raja Koduri, former head of AMD's RTG (Radeon Technology Group)). Also, IIRC they canceled the next Xeon Phi, but the one after that is still on the roadmap.
-potential energy is easily made kinetic-