
On what GPUs will the code of unused vertex interpolators be deleted automatically?

Started by April 04, 2024 02:36 PM
5 comments, last by betauser 8 months, 3 weeks ago

Hi, does anyone know on which GPUs interpolators are removed from the vertex shader if some of them are not used in the pixel shader during linking? Will the vertex shader code that only feeds those unused interpolators be removed as well?

What hardware does this? Which ones don't?

In the very limited amount of documentation about GPUs that I read some years ago, I understood that they are often SIMD systems, that is, a single instruction stream shared by all processors. In such a setup, “delete for some” isn't feasible at all.

For better answers, you should probably read about the global hardware design of various GPUs.


Alberth said:
That is, a single instruction stream shared by all processors. In such a setup, “delete for some” isn't feasible at all.

This should not matter, since most likely an entire CU / SM is filled with the same draw call.

Personally I do not think they analyze pixel shaders to optimize the preceding vertex shaders in the current pipeline. Since the API and the GPU scheduler cannot know which combinations of shaders will be used during the frame, the optimization would need to happen per draw call at runtime, which would require modifying code that has already been compiled.
Since this is usually resolved using shader permutations, which are the responsibility of devs, not drivers, related optimizations done by the driver or on the GPU would be redundant and cause more harm than good.

But maybe using GPU profiling tools would help to clarify. Ideally they show which interpolators are active for a draw call, and compiler output as well.

I mean code which is dead after vertex/fragment linking. Thanks for the answers, but maybe someone knows more about the internals of drivers 🙂 What is the real situation inside?

Graphics drivers can perform optimizations around removing vertex shader outputs that aren't used in subsequent pixel shaders, but the story is complicated. Up until Vulkan, most APIs didn't actually force the user to specify up front which vertex+pixel shader combinations would be used, so the driver had no way of performing such optimizations until the point at which everything is bound and a draw actually happens. Doing it that late might be undesirable for a handful of reasons, but it isn't impossible and probably does happen under the hood. AFAIK no hardware performs such optimizations; it just executes the bytecode that the driver sends it, regardless of whether things could be more optimal.

That being said, some APIs do allow the user to perform such optimizations up front when compiling the shaders initially. You effectively describe all the combinations of vertex+pixel shader that you'll use up front and “link” them offline, allowing the compiler to optimize out unused vertex outputs. In fact LunarG (the maintainers of the most popular Vulkan SDK implementation) recently released a white paper describing this exact process for Vulkan, though it is fairly clunky.

Other, more low-level APIs support this concept as well, e.g. the PlayStation proprietary shader compiler.
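
To make the "link and strip unused outputs" idea concrete, here is a minimal Python sketch. It is a toy model, not how any particular driver or compiler works: the statement list, the `link_vertex_shader` helper and all variable names are made up for illustration, and real toolchains do this on compiler IR rather than on name lists.

```python
# Toy model (hypothetical): a vertex shader is straight-line code, given as a
# list of (target, [source names]) statements.

def link_vertex_shader(vs_statements, vs_outputs, ps_inputs):
    """Keep only the statements needed for outputs the pixel shader reads."""
    # Outputs that survive linking: those the pixel shader actually consumes.
    live = set(vs_outputs) & set(ps_inputs)
    # Assumption: the rasterizer always needs the clip-space position.
    live.add("position")

    kept = []
    # Walk backwards: a statement is kept only if something later needs its target.
    for target, sources in reversed(vs_statements):
        if target in live:
            kept.append((target, sources))
            live.discard(target)
            live.update(sources)
    kept.reverse()
    return kept


vs = [
    ("world_pos", ["in_position", "model_matrix"]),
    ("position", ["world_pos", "view_proj"]),
    ("normal_ws", ["in_normal", "model_matrix"]),
    ("tangent_ws", ["in_tangent", "model_matrix"]),
    ("uv", ["in_uv"]),
]
outputs = ["position", "normal_ws", "tangent_ws", "uv"]

# This pixel shader only samples a texture, so it reads nothing but "uv".
linked = link_vertex_shader(vs, outputs, ps_inputs=["uv"])
for target, sources in linked:
    print(target, "<-", sources)
```

In this sketch the normal and tangent computations disappear because nothing downstream reads them, while `world_pos` stays because `position` depends on it.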

Interesting. Maybe the driver also optimizes the most frequently used pairs over time. Internally it could be a hash table.
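
A rough Python sketch of that hash-table idea follows. Nothing in this thread confirms that any real driver works this way; `LinkedPipelineCache` and the `specialize` callback are hypothetical names, and the "shaders" are plain strings standing in for compiled code.

```python
import hashlib


class LinkedPipelineCache:
    """Hypothetical cache of vertex shaders specialized per (VS, PS) pair."""

    def __init__(self, specialize):
        # specialize(vs, ps) stands in for the expensive link/optimize step.
        self._specialize = specialize
        self._cache = {}
        self.misses = 0

    @staticmethod
    def _key(vs_source, ps_source):
        # Hash both shaders together; the separator keeps ("ab", "c") and
        # ("a", "bc") from colliding.
        h = hashlib.sha256()
        h.update(vs_source.encode())
        h.update(b"\0")
        h.update(ps_source.encode())
        return h.hexdigest()

    def get(self, vs_source, ps_source):
        key = self._key(vs_source, ps_source)
        if key not in self._cache:
            # First time this pair is seen: pay the linking cost once.
            self.misses += 1
            self._cache[key] = self._specialize(vs_source, ps_source)
        return self._cache[key]


cache = LinkedPipelineCache(lambda vs, ps: f"linked({vs}|{ps})")
cache.get("vs_main", "ps_textured")   # miss: specialized and stored
cache.get("vs_main", "ps_textured")   # hit: reused from the table
cache.get("vs_main", "ps_lit")        # miss: same VS, different PS
print(cache.misses)
```

The cost of such a scheme is that the first draw with a new pair pays for the specialization, which is one of the reasons late, per-draw optimization can be undesirable.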

This topic is closed to new replies.
