Combining material shaders with other shaders

Started by July 14, 2022 01:28 PM
2 comments, last by MJP 2 years, 6 months ago

This is a question I've had for aaages, but I haven't been able to find a proper answer. These days, a lot of engines have material shaders that have incredible complexity, often using node-based editors. While not trivial, it seems pretty straightforward how these can be turned into compiled pixel shaders. However, pixel shaders are also often used for lighting and other effects the engine uses that are applied after the material is calculated. I have absolutely no idea how to combine the two.

I could for instance, pre-render the material shader onto an offscreen surface, and then use that as a texture input for a polygon using the engine's shader with all the lighting and whatnot code, but this seems excessive to do for every shader. Plus if the shaders use any animation, they would have to be rendered offscreen each frame.

I've also considered using multipass effect files: rendering the material in one pass, then doing lighting calculations in a second pass. But as I understand it, this can create artifacts with anti-aliasing.

Finally, I could compile the material shader into HLSL, and just write in my lighting code at the end of the material HLSL code.

I've tried to research this, but I can't find an answer as to how this is commonly implemented. I've even downloaded a few open source material editors to browse through their rendering code, but I struggled to find one in C++, which is really the only language I know.

Hope that makes sense.

Tooko13 said:
Finally, I could compile the material shader into HLSL, and just write in my lighting code at the end of the material HLSL code.

That's the common solution for a traditional ‘forward renderer’.
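As a rough illustration of what that looks like on the C++ side, here is a minimal sketch that stitches the generated material code and the engine's lighting code together as source text before compiling. Every name in it (`MaterialOutput`, `EvaluateMaterial`, `ShadeSurface`, `PSMain`, `BuildForwardPixelShader`) is hypothetical, and the HLSL inside the strings is only a placeholder Lambert term, not how any particular engine does it.

```cpp
#include <string>

// Engine-owned HLSL that every material shares. All names are hypothetical.
static const char* kEnginePrologue = R"(
struct MaterialOutput { float3 albedo; float roughness; float metalness; };
)";

static const char* kEngineEpilogue = R"(
float3 ShadeSurface(MaterialOutput m, float3 n, float3 l)
{
    // Placeholder Lambert term; a real engine would loop over its lights here.
    return m.albedo * saturate(dot(n, l));
}

float4 PSMain(float2 uv : TEXCOORD0, float3 n : NORMAL) : SV_Target
{
    MaterialOutput m = EvaluateMaterial(uv);
    return float4(ShadeSurface(m, normalize(n), float3(0, 1, 0)), 1);
}
)";

// materialHlsl is whatever the node graph compiler emitted. It is assumed to
// define: MaterialOutput EvaluateMaterial(float2 uv)
std::string BuildForwardPixelShader(const std::string& materialHlsl)
{
    std::string src;
    src += kEnginePrologue;   // shared struct the material fills in
    src += materialHlsl;      // generated per material
    src += "\n";
    src += kEngineEpilogue;   // lighting and entry point, written once
    return src;
}
```

The resulting string would then be handed to whatever shader compiler the engine uses. The material only ever fills in `MaterialOutput`, so the lighting code stays in one place.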

Tooko13 said:
I could for instance, pre-render the material shader onto an offscreen surface

That's the common solution for a 'deferred renderer'.
Multiple material properties (roughness, metalness, albedo…) are rendered to a GBuffer, which is often fat and requires some compromise to compress.
After that, the lighting pass reads those properties to apply the lighting, respecting the material.
The advantage is decoupled visibility and shading: we only shade the pixels which are actually visible. A forward renderer lights all pixels, even if they are later overdrawn by other triangles.
Because a GBuffer stores only one surface per pixel, it cannot handle transparency directly, so most deferred renderers also use a forward pass for transparent objects.
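To make the two passes concrete, here is a small CPU-side sketch in C++ that mimics the data flow. It is not GPU code, and the names (`GBufferTexel`, `MaterialPass`, `LightingPass`) and the single directional Lambert light are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Float3 { float x, y, z; };

// One GBuffer texel: the material properties the lighting pass needs.
struct GBufferTexel {
    Float3 albedo;
    Float3 normal;     // assumed to be unit length
    float  roughness;
    float  metalness;
};

// Pass 1: the "material shader" writes properties only, no lighting.
// materialFn is a hypothetical stand-in for a compiled material.
template <typename MaterialFn>
void MaterialPass(std::vector<GBufferTexel>& gbuffer, MaterialFn materialFn)
{
    for (std::size_t i = 0; i < gbuffer.size(); ++i)
        gbuffer[i] = materialFn(i);
}

// Pass 2: one engine-owned lighting routine, shared by every material.
std::vector<Float3> LightingPass(const std::vector<GBufferTexel>& gbuffer, Float3 lightDir)
{
    std::vector<Float3> lit;
    lit.reserve(gbuffer.size());
    for (const GBufferTexel& t : gbuffer) {
        float nDotL = t.normal.x * lightDir.x + t.normal.y * lightDir.y + t.normal.z * lightDir.z;
        nDotL = std::max(nDotL, 0.0f);  // surfaces facing away receive no light
        lit.push_back({t.albedo.x * nDotL, t.albedo.y * nDotL, t.albedo.z * nDotL});
    }
    return lit;
}
```

The lighting code never sees which material produced a texel, which is what lets the engine and material shaders stay separate.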

The two concepts can also be mixed, e.g. using the ‘Forward+’ approach.

There's plenty of related literature, tutorials, etc. Should be easy to find. Actually I wonder why you ask, so maybe I got the question wrong.

As JoeJ mentioned, deferred and forward are the two main categories that engines typically use, although there are a lot of variations within those two broad categories. Forward can be simple since you do everything in one big monolithic pixel shader, which can save you some headaches and keeps everything “on-chip” rather than having to write things out to memory. If you structure your code correctly you can also cleanly separate the “engine” part of the shader from the “material” part of the shader. However, there are numerous downsides:

  • Big pixel shaders can have poor performance due to high register usage and instruction/L1 cache thrashing
  • You can end up generating a ton of shader permutations due to having too many things in one shader. On top of that, big shaders take longer to compile.
  • As JoeJ mentioned, you may need a depth prepass to avoid shading pixels that aren't actually visible
  • Since pixel shaders always execute in 2x2 quads, your performance will suffer if you have high triangle density due to quad overshading
  • It can be tough to optimize your shaders since you have to do all lighting steps in a single monolithic shader
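To put a number on the permutation point above: if every feature is compiled in or out independently, the worst-case shader count is the product of the variants per feature. A tiny C++ sketch, where the function name and the idea of a flat feature list are assumptions for illustration:

```cpp
#include <cstdint>
#include <vector>

// Each entry is the number of variants one independent feature can take
// (2 for an on/off toggle). The product is the worst-case shader count.
std::uint64_t CountPermutations(const std::vector<std::uint32_t>& variantsPerFeature)
{
    std::uint64_t total = 1;
    for (std::uint32_t v : variantsPerFeature)
        total *= v;
    return total;
}
```

Ten on/off toggles already give 1024 variants per material. In practice many combinations are never used, so treat this as an upper bound.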

Deferred approaches can sidestep a lot of these issues by effectively splitting things up into multiple phases: you can split your different lighting sources into totally separate shaders and passes, quad overdraw and hidden surfaces are not an issue, you can use async compute to overlap the shading with other work, and you can end up with far fewer permutations. Having a G-Buffer also makes certain techniques more viable simply from having the data available: for example SSAO, screen-space reflections, and various ray tracing techniques. The main trade-off is that you now have to spill data to memory, which is not free in terms of memory consumption and performance. The various flavors of deferred will take different approaches to what gets spilled, which changes the calculus. The other trade-off is that you lose some nice conveniences from doing everything right in a pixel shader: in particular, anything involving implicit derivatives becomes more difficult, and MSAA is a royal pain to get right (most engines don't even bother with it).
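For a feel of what spilling to memory costs, here is a back-of-the-envelope C++ helper. The layout in the example (four render targets at 4 bytes per pixel each) is purely an assumption, and the estimate ignores the depth buffer and any hardware compression.

```cpp
#include <cstdint>

// Raw G-Buffer footprint in bytes. bytesPerPixel is the sum over all render
// targets; msaaSamples is 1 when MSAA is off.
std::uint64_t GBufferBytes(std::uint32_t width, std::uint32_t height,
                           std::uint32_t bytesPerPixel, std::uint32_t msaaSamples)
{
    return std::uint64_t(width) * height * bytesPerPixel * msaaSamples;
}
```

At 1920x1080 with 16 bytes per pixel that is 33,177,600 bytes (about 31.6 MiB), and 4x MSAA multiplies it by four, which is part of why MSAA is so painful with deferred.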

A third option, which I think you were describing in your post, is to “cache” your evaluated materials in texture space and then sample that “flattened” texture when rasterizing and shading your scene geometry. This has definitely been done, but it's tricky. I've mostly seen it done in the context of virtual texturing, which helps to manage the memory requirements of having to allocate unique texture data for every surface. See this presentation and this one for references. It's viable, but unfortunately there's a ton of details to get right to make it work: you need unique UVs for every surface, you need to determine which surfaces are visible and at which mip level in order to manage and update your cache, you ideally want to compress the generated results to a BC format at runtime, you need to figure out how to generate the cache efficiently, etc. Another related approach is to completely pre-compute the “flattened” textures offline and ship that data on-disk. This is effectively what RAGE did with MegaTextures. That avoids a few issues but introduces some new ones: you need to somehow store all of that data on-disk and stream it off efficiently, and then you probably still need to compress it at runtime. To date I don't think anybody else has gone down that route.
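As one small piece of the "which mip level" problem, here is a C++ sketch of the usual isotropic estimate from screen-space UV derivatives. The function name and parameters are hypothetical, and real hardware and virtual texturing systems differ in the details (anisotropy, bias, rounding).

```cpp
#include <algorithm>
#include <cmath>

// Estimates the mip level a pixel would sample, given the UV change per
// screen pixel in x (dudx, dvdx) and in y (dudy, dvdy) and the mip 0 size.
float EstimateMipLevel(float dudx, float dvdx, float dudy, float dvdy,
                       float texWidth, float texHeight, float maxMip)
{
    // Scale UV derivatives into texels per pixel.
    float tx = dudx * texWidth, ty = dvdx * texHeight;
    float sx = dudy * texWidth, sy = dvdy * texHeight;

    // Squared length of the longer footprint axis.
    float maxLen2 = std::max(tx * tx + ty * ty, sx * sx + sy * sy);
    if (maxLen2 <= 1.0f)
        return 0.0f;  // magnified or 1:1, so use the most detailed mip

    // log2(sqrt(x)) == 0.5 * log2(x)
    return std::min(0.5f * std::log2(maxLen2), maxMip);
}
```

A cache manager could evaluate something like this per visible surface to decide which mip of the flattened material needs to be resident.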
