
Scalable shader to blend multiple materials in general form

Started by kotets, August 24, 2024 10:20 AM
10 comments, last by kotets 2 weeks, 3 days ago

Hi folks,

This is my first post here, part of the gamedev-learning journey I started ~5 months ago.

I'm learning GLSL/shaders at the moment and trying to do a seemingly simple thing, but I'm not sure what the best practice is and I don't want to reinvent the wheel.

Task

A shader that blends between materials (up to 6 in any single pixel), with no limit on the total number of possible materials:

  • Imagine 256 materials, defined in their own shader graphs / GLSL files
  • Each vertex has only a single material assigned. Always 100%, never a mix at the vertex position itself.
  • Each vertex has neighbors with a maximum of N other different materials
    • Most cases — one or two;
    • Extreme cases: up to 6, in 0.000001% of cases
  • Where materials change — I want them to blend between each other smoothly, instead of just abruptly becoming different

---------

Many tutorials I've seen do the following: if you have 6 materials, they suggest 6 channels, set them per vertex and then just mix ~6 varyings. All these tutorials start with "Ok, let's do dirt, grass and sand", but then the solution doesn't seem scalable.

A channel/varying per material is trivial and obvious. But it doesn't scale: with a potential 256 materials you'd need 256 channels and 256 varyings, plus all the overhead, which sounds extreme. And a sophisticated colony-management game aiming for ~45 different metals and 14 biomes could easily need even more.

A channel/varying per material also ignores the fact that materials only change between neighbors. Each vertex is always 100% one specific material, with no blend at all; blending only happens where two direct neighbors have different materials. A distant neighbor (i/j/k differing by 2 or more) never affects the voxel at [i, j, k]. So all the blending needs to happen just around the transition; I've attached the image above.

Now imagine what overkill it would be to have 255 channels sitting unused, passing 255 zeroes to the GPU for each vertex in 99.9999% of cases.

---------

I'd love something similar to:

  • For each triangle: say “In this triangle, we are blending between stone/dirt/sand”
  • “Vertex 1 is stone, vertex 2 is sand, vertex 3 is dirt”
  • Blending logic is always the same; it doesn't matter what it is, we can assume some trivial smoothstep 0→1 or anything similar (a rough sketch of what I imagine is below).
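Something like this sketch is what I imagine on the fragment side (all names are made up, and vTrianglePalette would be the same for all three vertices of a triangle):

flat in ivec3 vTrianglePalette; // e.g. (STONE, SAND, DIRT) for this triangle
in vec3 vSlotWeight;            // (1,0,0) at vertex 1, (0,1,0) at vertex 2, (0,0,1) at vertex 3
out vec4 fragColor;

vec4 shadeMaterial(int materialId); // dispatches to the per-material shader code

void main() {
  fragColor = vSlotWeight.x * shadeMaterial(vTrianglePalette.x)
            + vSlotWeight.y * shadeMaterial(vTrianglePalette.y)
            + vSlotWeight.z * shadeMaterial(vTrianglePalette.z);
}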

So how do you usually do this?

  • Do I use 6 flat int varyings per vertex, mapping each of the 6 channels to a specific material, so that the drawn triangle knows which material each varying represents?
  • Or would it be better to make every existing blend combination a separate material, with 6 uniforms that map channels to materials for that specific triangle? Given that 99% of the map is a single material, that sounds more optimal performance-wise, but I'm not sure.
  • Am I overcomplicating this and there are simpler tools?

Thank you!

  • Each vertex has only a single material assigned. Always 100%, never a mix at the vertex position itself.

What I mean is this, vertices are circles:

So at each vertex the material is 100% grass or 100% stone, etc.


kotets said:
So how do you usually do this?

I would bake it once, then render it each frame as a single texture as usual.
Why pay such a high cost for image composition every frame when the result is always the same?

The baking could happen either offline (levels), or as a background task of the game (open world).
For the latter i would implement it with async compute shaders, expecting some additional complexity regarding memory management and texture compression. Tiled resources could help with memory management.

kotets said:
So at each vertex the material is 100% grass or 100% stone, etc.

Artistically this is a big limitation, because then we can not blend smoothly across a wider distance.
If we zoom out or look into the distance, all material transitions become sharp, which isn't natural.

But maybe you want it this way. Otherwise notice that baking gives many more options here, since we can do expensive, higher-quality blending. To make that possible, one could allow 4 material indices and weights per vertex, for example.
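As a sketch, the vertex format could then carry something like this (made-up names):

in uvec4 aMaterialIndices; // four material ids per vertex
in vec4  aMaterialWeights; // their blend weights, ideally summing to 1.0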

However, i see my proposals are very complex compared to the simple system you have in mind.
That's just said to add another perspective on the problem.

I'll see if i can help on your actual question now… ; )

kotets said:
What I mean is this, vertices are circles:

Ok, so you want a grid interpolation, similar to a texture, where each texel is its own material.
And your vertices should represent the centers of the texels.

I think you overlook the actual problem with this idea.
Your example works, but only because all your vertices are 'regular'. (Regular means they have 6 edges for a triangle mesh.)

I'll draw an example using some irregular vertices:

Now try to figure out what the boundary of influence from the green vertex should look like.
That's not trivial, so i've drawn a cloudy circle.
But how do we do this exactly?
The simplest rule for a triangle mesh is: Connect edge centers to triangle centers. Which looks like:

It's not a nice shape because of the jaggies. Result would not look good.
And worse: If we generate texture UVs implicitly from the geometry, we get terrible texture stretching and tiling artifacts.

Applying this rule to your example of the regular mesh would not look good either, by the way.
But it would look good and work if your mesh was a quad mesh instead of a triangle mesh.
But regular quad meshes can only do one topological shape: A flat patch of surface - e.g. a disk, or a quad.
If you are fine with that, what you really want is a height map data structure, representing a 2D regular grid. Then texturing is trivial, which is why most people use it for terrain.

So my question is: Do you want height maps, or complex geometry with holes, caves and overhangs, requiring a mesh?

If it's a heightmap, i would say think about grids primarily, not about triangles.
If it's complex topology, i can give some overview of options regarding implicit texturing.

Hi JoeJ,

Thank you for your response.

Sorry for not providing enough information, I was not sure how much is relevant. Let me give a more expanded explanation.

For my first game I'd like to make a Dwarf Fortress / RimWorld-like game, but in 3D. Every voxel editable, everything generated. For terrain, given that I want full control over one of the very core systems, I decided not to get a ready-made plugin like some of the voxel plugins out there, but to build my own. Learning + flexibility, a win-win.

All the data and every mechanic in the game actually operates on square cells of the same size, exactly like Dwarf Fortress. But I wanted it smooth, not blocky voxels, so I decided I need something Marching Cubes-like.

So I'm studying the https://transvoxel.org/ paper. I've fully finished the geometry part for both primary and transition meshes, together with simple generation, SDFs and LODs. In a nutshell, this is where I am at the moment of writing:

Then I started studying the materials section of the paper and noticed that either I didn't understand it, or Eric only describes how to blend different textures across the planes of a triplanar projection of the same material, not how to blend different materials on the same plane. I've re-read it like 20 times; maybe I'm too much of a beginner and I'm missing something, but my impression is that blending between different materials is neither mentioned nor shown in any image of the paper. Correct me if I'm wrong.

Thus I started studying this topic, spent several hours, felt stupid, and decided to ask first; maybe it's already solved. 😅

In the end, after I did a bit of my own research, I think that what I need is very similar to this solution in a popular voxel plugin.

Given that I do not need weights, I thought about packing a few more possible materials into one 32-bit vertex attribute:

  • 2 bits — number of materials (minus one) used by the rendered triangle.
  • 2 bits — the slot of this vertex's own material. As per the description above, at each specific vertex its own material is 100% and any other potential material is 0%.
  • 7 bits × 4 — a material index for each blending channel, allowing 128 different materials (indices 0-127); see the packing sketch below.
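The packing would look roughly like this (shown in GLSL syntax just for illustration; in my case it actually happens on the CPU during meshing, and packMaterialData is a made-up name):

// Bit layout, from the high bits down:
// [count-1 : 2][this vertex's slot : 2][mat0 : 7][mat1 : 7][mat2 : 7][mat3 : 7]
uint packMaterialData(uint materialCount, uint vertexSlot,
                      uint mat0, uint mat1, uint mat2, uint mat3) {
  return (((materialCount - 1u) & 3u) << 30)
       | ((vertexSlot & 3u) << 28)
       | ((mat0 & 127u) << 21)
       | ((mat1 & 127u) << 14)
       | ((mat2 & 127u) << 7)
       |  (mat3 & 127u);
}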

Then it is passed into the vertex shader that can do something like:

// Vertex shader (GLSL 3.30 / GLSL ES 3.00 style; integer varyings must be flat)
in uint aMaterialData;

flat out int vMaterialId0;
flat out int vMaterialId1;
flat out int vMaterialId2;
flat out int vMaterialId3;
out vec4 vMaterialBlend;

void main() {
  uint currentMaterial = (aMaterialData >> 28) & 3u;
  vMaterialId0 = int((aMaterialData >> 21) & 127u);
  vMaterialId1 = int((aMaterialData >> 14) & 127u);
  vMaterialId2 = int((aMaterialData >> 7) & 127u);
  vMaterialId3 = int(aMaterialData & 127u);
  // Set material blend: one-hot weights that the rasterizer interpolates
  vMaterialBlend = vec4(0.0);
  if (currentMaterial == 0u) vMaterialBlend.r = 1.0;
  else if (currentMaterial == 1u) vMaterialBlend.g = 1.0;
  else if (currentMaterial == 2u) vMaterialBlend.b = 1.0;
  else vMaterialBlend.a = 1.0;
  // ...
}

And then each pixel in the fragment shader knows exactly which material each channel represents and what interpolated blend weight it has at that pixel. So in the fragment shader I could do something similar to:

flat in int vMaterialId0;
flat in int vMaterialId1;
flat in int vMaterialId2;
flat in int vMaterialId3;
in vec4 vMaterialBlend;

vec4 renderSoil();  // per-material shader code, defined elsewhere
vec4 renderStone();

vec4 renderMaterial(int materialId) {
  if (materialId == 1) {
    return renderSoil();
  } else if (materialId == 2) {
    return renderStone();
  } // ...
  return vec4(0.0, 0.0, 0.0, 1.0);
}

vec4 blendMaterials() {
  vec4 finalColor = vec4(0.0);
  if (vMaterialBlend.r > 0.0) {
    finalColor += vMaterialBlend.r * renderMaterial(vMaterialId0);
  }
  if (vMaterialBlend.g > 0.0) {
    finalColor += vMaterialBlend.g * renderMaterial(vMaterialId1);
  }
  if (vMaterialBlend.b > 0.0) {
    finalColor += vMaterialBlend.b * renderMaterial(vMaterialId2);
  }
  if (vMaterialBlend.a > 0.0) {
    finalColor += vMaterialBlend.a * renderMaterial(vMaterialId3);
  }
  return finalColor;
}
// ... use it

So, data-wise it is one uint32 attribute per vertex, 4 flat int varyings and a vec4 varying to blend. Doesn't seem too heavy? What do you think? Target hardware — mid/high-end PCs, no mobile/VR/etc.

Regarding your note on vertices & colors: after I experiment a bit with my current approach I'll try to understand it better, but if I understood you correctly, I don't think this is a problem in my case, because:

  • Same as mentioned here — I'd agree that for voxel terrains, blending more than 4 different materials in a single marching-cubes voxel is such an extremely rare case that it's not worth optimizing for; a tiny artifact in 1 case out of 10,000, buried under all the shading, displacement, foliage and other effects, is OK.
  • As soon as more than 4 materials meet inside the same chunk processed by Transvoxel (my example above), I'll just duplicate vertices instead of reusing the same vertex, and restart the material mapping. It won't be frequent, and a bit of duplication doesn't seem important.

-----------------

So that's a bit more context. For me personally, the breakthrough came from reading this in Eric's paper on materials:

If a vertex is reused from a preceding cell, and the preceding cell selects a different material, then the vertex must be duplicated so that the different texture map indexes can be assigned to it.

As I'm very much a novice, the idea of just creating the same vertex N times at the same position didn't cross my mind at all for some reason.

Because of that, I was struggling a lot with the case of a cell vertex connected to 6 other vertices, all with different materials, totalling 7 materials. That ended up needing 7 different channels to blend, everything became super heavy, and I was genuinely stuck.

Now I just duplicate this vertex 7 times with the same coordinates but different mappings, and I'm happy, since it happens in 0.0001% of cases, while 90% of cases are a single material, rare cases two, and super rare cases three.

I'll post an update here as soon as I figure out all the specifics and see how it looks, but I feel like I'm moving in the right direction.

I'd love to hear what you think

Interesting, and not too different from the stuff i work on, basically quad-dominant remeshing:

Sponza scene made from the voxelized model.

Terrain made from simulated particles, and each particle becomes a SDF primitive.

My pipeline is: Volumetric scene representation, iso surface extraction, remesh to get mostly quadrilaterals with edge flow aligning to the curvature of the model.

If i subdivide the mesh once, it's all quads with similar area and minimized distortion, so seamless implicit texturing becomes possible.

So i guess this can become a very flexible and powerful tool for content generation, but i have not really worked on that, nor on efficient rendering. (i primarily need this data for realtime GI, but will try to use it for visible geometry as well later…)
So i can not help much with questions about efficient rendering on the low level, lacking experience.

But i know a lot about the texturing problem itself, which is usually underestimated by programmers, because it's the artists who care about it.

There is an easy way of texturing complex geometry, which is 3D textures. No matter the shape, we can just map the 3D coordinates of the surface to 3D coordinates of the volumetric texture. No problem with seams or tiling - it just works.
But 3D textures take too much memory. So we use a cheap trick instead: Triplanar mapping (or related variations of the same idea).
Triplanar mapping basically fakes 3D textures with blending 3 2D texture planes, one per axis.
The blending causes a blur, but if that's acceptable it is an easy way to solve the problem.

The hard way is the traditional one: Map a 2D image on the surface of a complex model.
To do this, we usually create UV maps, optimized for uniform texel size and minimized stretching across some patch of surface. But we can not do this over a whole mesh. We have to segment the mesh into multiple patches, with a discontinuity in UV space on the boundary between those patches, which gives us the typical UV charts. They are created manually by artists, who try to hide the seams in creases or sharp edges.
Doing this automatically for procedural content is very hard, but also eventually too slow for our needs.
One problem that can be solved is texture seams; my work or Disney's Ptex would be examples of that.
Because it's all quads, mapping blocks of square texels to them becomes trivial. So the texel on one side of a boundary only needs to match the texel on the other side, and no seam is visible. It can even do displacement mapping without any cracks.
But it still does not solve the primary problem. Say i have such a quad mesh, and i have a square image which tiles. Can i put the image on every quad, so my texturing has no seams?
The answer is no. This is because it's impossible to have a quad mesh without singular vertices. And if we have a vertex with 3 edges, we get the problem of misaligned texture orientation:

I've drawn a red X at the edge where the chess pattern breaks.
The conclusion of this simple problem is: It is not possible to texture complex topology using tiled images.

Thus, as you plan to use tiling textures, those are your options:
1. 3D texturing (triplanar mapping, procedural volume textures).
2. Use 2D tiles on the 3D surface, but accept seams and try to minimize them.
3. Restrict to height map terrain, making both texturing and LOD almost trivial and solvable in any case.

So if you were not fully aware of this, it might be good to think about it first, before tackling low-level rendering details of an approach which later turns out to have unexpected problems.
Rendering is not the primary problem here. It's much more about ‘how to create the content - either manually or procedurally, so it looks good’.

It also helps for studying papers. You can classify which of those categories some proposed approach belongs to.

Personally i would ignore concerns about rendering performance, how many bits you need for adjacency information, etc. I would first try it out just so that it works, then make tools so you can paint materials on your terrain, and only if you are happy start to optimize. Chances are you're not happy, and you might come up with something completely different. So there is the risk of premature optimization.

kotets said:
As I'm very much a novice, the idea of just creating the same vertex N times at the same position didn't cross my mind at all for some reason.

Yeah, it's not obvious. :)
But they do it for every vertex at a UV chart boundary as well. Two vertices have the same pos, the same normal, but different UVs? Duplicate. Problem solved. Materials are also a very common reason to split meshes.

Sadly i can't help on transvoxel etc., but those not so obvious topological truths are surely worth a thought as well. ;)

Thank you, this is valuable info, I'll do some digging on 3D textures.

I was going triplanar 100%, so the only thing left to solve was how to blend textures that sit on exactly the same plane, like on an absolutely flat surface where everything maps to the same plane of the triplanar anyway.

But with the approach I'm trying now, looks like it's working fine and not too much overhead on the shaders. I'll need to add more materials for a better screenshot, but I was able to achieve this today already:

In a nutshell, I can have from 1 to 4 materials, each with triplanar calculations or any other shader code inside, and then I blend between them:

vec4 renderMaterial(int materialId) {
 if (materialId == 1) {
   return renderSoil();
 } else if (materialId == 2) {
   return renderStone();
 } // ...
 return vec4(0.0, 0.0, 0.0, 1.0);
}

vec4 blendMaterials() {
 vec4 finalColor = vec4(0.0);
 if (vMaterialBlend.r > 0.0) {
   finalColor += vMaterialBlend.r * renderMaterial(vMaterialId0);
 }
 if (vMaterialBlend.g > 0.0) {
   finalColor += vMaterialBlend.g * renderMaterial(vMaterialId1);
 }
 if (vMaterialBlend.b > 0.0) {
   finalColor += vMaterialBlend.b * renderMaterial(vMaterialId2);
 }
 if (vMaterialBlend.a > 0.0) {
   finalColor += vMaterialBlend.a * renderMaterial(vMaterialId3);
 }
 return finalColor;
}

where each material can have its own logic. For example, soil is a triplanar blend with grass on top and dirt on the sides and bottom:

vec4 renderSoil() {
  vec3 normal = normalize(vWorldNormal);
  vec3 blendWeights = abs(normal);
  blendWeights /= (blendWeights.x + blendWeights.y + blendWeights.z);

  vec2 uvX = vPosition.yz * uTextureScale;
  vec2 uvY = vPosition.xz * uTextureScale;
  vec2 uvZ = vPosition.xy * uTextureScale;

  // Sides project along X and Z: always dirt.
  vec4 colorX = texture2D(uTextureDirt, uvX);
  vec4 colorZ = texture2D(uTextureDirt, uvZ);
  // The Y projection covers top and bottom: grass on top, dirt underneath.
  vec4 colorY = normal.y > 0.0
      ? texture2D(uTextureGrass, uvY)
      : texture2D(uTextureDirt, uvY);

  return colorX * blendWeights.x + colorY * blendWeights.y + colorZ * blendWeights.z;
}

What I like about this organization is that it's not just blending between 4 textures, but between the results of 4 functions — I can use triplanar for one, some fancy shader that generates moss/grass for another, etc.

And given the explicit if (vMaterialBlend.r > 0.0) { — if the weight is 0, we skip any extra logic. Although I believe, given how computers store floats, I should do something like > some_epsilon to be on the safe side 🤣
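Something like this, where kBlendEpsilon is just a name I'd pick (1/255 would also cut off contributions too small to change an 8-bit channel anyway):

const float kBlendEpsilon = 1.0 / 255.0; // below this, the contribution can't change an 8-bit channel

if (vMaterialBlend.r > kBlendEpsilon) {
  finalColor += vMaterialBlend.r * renderMaterial(vMaterialId0);
}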

I'll need to dedicate a bit more time, but I believe that's the way to go for a voxel-based terrain system.

Thank you for sharing and btw — great work on that remesher. I approximately understand your pipeline, but I'll be honest — very approximately for now, need more experience 😅


kotets said:
I'll need to dedicate a bit more time, but I believe that's the way to go for a voxel-based terrain system.

Yeah, makes sense and no problems with seams or tiling. Also no need to ‘bake’ since the blending isn't that expensive.
Btw, it's possible to improve triplanar mapping, so the blending does not reduce sharpness at some angles: https://eheitzresearch.wordpress.com/722-2/

That's something you could eventually look into at some point i guess. Later there were variations of the technique which are faster at similar quality.

kotets said:
And given the explicit if (vMaterialBlend.r > 0.0) { — if the weight is 0, we skip any extra logic. Although I believe, given how computers store floats, I should do something like > some_epsilon to be on the safe side 🤣

Yeah, the epsilon might help, but maybe not much.
Remember the GPU can only really skip some code if all threads of a thread group take the same branch. Groups are typically 32 or 64 threads wide, and if only one thread of those needs to calculate a rock material, then all the other threads of the group will run the rock logic too (but they will do no reads from memory, e.g. texture fetches.)
That's why it's often said that branches are 'slow' on GPUs.
But personally i never agreed with this kind of thinking. Telling people they should not use branches is like saying ‘you should never optimize and do brute force all the time’.

So, to make your skipping effective, you would need to assign the same materials to larger areas of the terrain. Then you pay the cost of redundant material logic only at material boundaries (if that's a problem at all).

There is only one worry i have with your approach: LOD.
If you use the transvoxel algorithm, which can do continuous LOD transitions afaik, and you store material per vertex, then what happens as vertex resolution decreases?

Oh wow, this is mindblowing. Do you know where I can read more about it?

I don't think I can fully understand how it works. Would you mind chatting a bit more about the topic, so I can understand it better?

Right now it behaves more or less how I expect. To debug, I put this:

if (vMaterialBlend.r > 0.0) {
  finalColor += vec4(1.0, 0.0, 0.0, 1.0);
}

and dirt became red.

But when I did the following:

if (vMaterialBlend.b > 0.0) {
  finalColor = vec4(0.0, 0.0, 1.0, 1.0);
}

I specifically picked full blue so that you can't visually miss it, and so that it's written directly into the final color. And nothing was blue — so it looks like the code inside the if was not run.

Second test — in the first branch I did this:

for (int i = 0; i < 10000; i++) {
  finalColor += vMaterialBlend.r * renderMaterial(vMaterialId0) * 0.0001;
}

That forced slide-show levels of lag and 10 FPS.

But when I put the same loop into the second if — 60 FPS, all good.

I do not know how to usually debug all of these, but I am having an impression that in the end, on my machine with my setup — the shader doesn't enter the if statement when the channel is empty.

And I made sure to run all of these tests with a bit of the third material present, the one that triggers the if:

So it is blue, but all the other pixels seem to skip the finalColor = vec4(0.0, 0.0, 1.0, 1.0);. Notice the = sign: if that code had run, it would overwrite whatever the other parts did, right? So it didn't run on the other pixels? Or?

I was under the impression that, OK, it's all parallel and stuff, but in the end it is code: there is a graphics processor executing instructions, operating on registers, and when you hit an if, at a specific moment in time it reads 0 or not-0 from a specific location, so the result is deterministic at every specific moment in time. Or?

Can you help me understand? My mind just exploded 😅 I'm also happy to not take up your time and instead do my homework and read about it somewhere myself, if there are well-known sources. I'm just not even sure how to properly google this.

JoeJ said:

There is only one worry i have with your approach: LOD.
If you use the transvoxel algorithm, which can do continuous LOD transitions afaik, and you store material per vertex, then what happens as vertex resolution decreases?

I'm going to follow the approach that is mentioned in the transvoxel paper or maybe slightly modify it, but not too much:

For the highest-resolution cells, we choose the material associated with the voxel at the minimal corner. For lower-resolution cells, we select the material by examining its eight subcells and choosing the material that is most populous among those subcells that are not trivial. This process recurs until the highest resolution is reached, and it ensures that the materials rendered at lower levels of detail are materials that actually participate in the rendering of the highest-detail terrain surface. This process is necessary because materials are often painted by hand, and the voxel data may only be updated at the highest resolution near the terrain surface. If we simply selected the material specified by one of the corner voxels for a low-resolution cell, we are likely to choose some default material identifier that is assigned to a voxel deep inside the terrain in solid space or far outside in empty space.

(Paper)
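In my head, one level of that selection looks roughly like this (just a sketch: the function name and the "negative index = trivial subcell" convention are mine, and in practice this runs CPU-side during meshing; I'm keeping GLSL syntax for consistency):

int selectLodMaterial(int subcellMaterials[8]) {
  int best = -1;
  int bestCount = 0;
  for (int i = 0; i < 8; i++) {
    int m = subcellMaterials[i];
    if (m < 0) continue; // trivial subcell (no surface) doesn't vote
    int count = 0;
    for (int j = 0; j < 8; j++) {
      if (subcellMaterials[j] == m) count++;
    }
    if (count > bestCount) {
      best = m;
      bestCount = count;
    }
  }
  return best; // -1 if all subcells were trivial
}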

kotets said:
Oh wow, this is mindblowing. Do you know where I can read more about it? I don't think I can fully understand how it works. Would you mind chatting a bit more about the topic, so I can understand it better?

You mean you want resources about how GPUs work in detail?

Hmm, well - i remember this blog series about the graphics pipeline:
https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/

Iirc this is mostly about the fixed-function units of GPUs, e.g. triangle rasterization.

But this does not tell anything about the other application of GPUs - being a parallel processor for general purpose programs. For that i recommend the Compute Shader chapter in the OpenGL Superbible book. It's short but covers everything with good and simple examples.
(Notice Pixel or Vertex Shaders do not expose any parallel programming, only compute shaders do.)

kotets said:
I do not know how to usually debug all of these, but I am having an impression that in the end, on my machine with my setup — the shader doesn't enter the if statement when the channel is empty.

Sadly drawing pixels with some color is indeed the standard way to debug on GPU. : (
Beyond that, i also use a buffer of raw memory printed on screen each frame, and i may manipulate it from GPU programs, e.g. increasing one number with an atomic_add to show how often a certain branch of code was called.
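A rough sketch of that counter idea in GLSL (requires SSBO support, i.e. GL 4.3 / ES 3.1, and all names are made up):

layout(std430, binding = 0) buffer DebugCounters {
  uint branchHits[4]; // read back on the CPU, or print on screen each frame
};
...
if (vMaterialBlend.b > 0.0) {
  atomicAdd(branchHits[2], 1u); // counts how often this branch really ran
  ...
}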

Sadly the tech industry has totally failed to make GPUs available for generic every day programming work.
There is no strong vendor independent language standard and API, there are no proper tools for debugging. It remains a realm for low level nerds.
But at least there are profiling tools, which i can recommend using at some point.

Regarding your confusion: you would need to post the whole shader for me to add my speculation, but i guess the branch is indeed never called, or only called in some places you did not realize.
Just keep drawing colors until you figure out why.
But yes, GPUs are as reliable as CPUs. They do not make mistakes or do inaccurate math. They only do what we tell them to do.
Ofc. there's a bigger chance of ‘multithreading bugs’, e.g. if different threads write to the same memory location. For graphics things, this usually shows as flicker on screen.

In general, it worked. It could probably be optimized and developed further, but it works and allows 100% vertex reuse in cases with ≤ 4 materials, only breaking reuse when one vertex would need different materials in the same material slots.

I'll work on the triplanar and on making it prettier; as you can see, it doesn't blend super nicely in some places 😅 Too blurry at times.

Thanks for the links, I'll do my reading on GPUs later.
