Fast GLSL blur Shader

Started by December 09, 2019 08:05 PM
4 comments, last by iradicator 4 years, 10 months ago

It would be nice if someone could point me to sources for a fast (fast enough for a game) GLSL shader for a post-processing blur effect. The blur amount should be adjustable. I know how to use Google and Shadertoy, but it's much more efficient to get some pointers from devs with direct experience.

P.S. I can't use mipmaps.

I was researching and implementing fast blurs recently and I have a few things to say.

I would strongly suggest that you downsample your image, blur it and then upsample it back during composition to keep the complexity minimal, especially for large radii. You don't really need mipmaps for this.
Note that the downsampling and upsampling steps themselves blur the image further, depending on how you implement them.
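
To make the downsampling idea concrete, here is a minimal sketch of a downsample pass. It assumes the render target is one quarter of the source size in each dimension, the source dimensions are multiples of 4, and the source texture uses GL_LINEAR filtering with clamp-to-edge wrapping. The names `source` and `uv` are placeholders of mine, not taken from any of the linked articles.

```glsl
#version 330 core

in vec2 uv;                 // 0..1 across the quarter-size render target (assumed name)
out vec4 outColor;

uniform sampler2D source;   // full-resolution image, GL_LINEAR filtering assumed

void main()
{
    vec2 texel = 1.0 / vec2(textureSize(source, 0));

    // Each output pixel covers a 4x4 block of source texels. Four taps placed
    // one texel away from the block center each land on the corner shared by
    // a 2x2 group, so the bilinear filter returns the average of those 4 texels.
    vec4 sum = texture(source, uv + texel * vec2(-1.0, -1.0))
             + texture(source, uv + texel * vec2( 1.0, -1.0))
             + texture(source, uv + texel * vec2(-1.0,  1.0))
             + texture(source, uv + texel * vec2( 1.0,  1.0));

    outColor = 0.25 * sum;  // average of all 16 source texels
}
```

For the upsample you can simply sample the small blurred texture with GL_LINEAR filtering during composition; that already smooths the result a little more.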

First, make sure that you understand the benefit of using a separable kernel and blurring in 2 passes. A Gaussian, for example, is separable. There are lots of articles out there explaining this part.

Must you use a variable-size kernel, or do you have a fixed set of kernel sizes to choose from? In the latter case, you can optimize by using bilinear sampling in between texels as described here:
http://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling/ and use shader permutations to choose the kernel size.

If you insist on using a Gaussian blur with a variable kernel, check out this excellent GPU Gems article on how to compute it incrementally: https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch40.html

I was able to achieve great results by applying a simple box filter a few times, as explained by Fabian here: https://fgiesen.wordpress.com/2012/08/01/fast-blurs-2/. This implementation has a few neat properties: it's separable, and you get to control the kernel support (i.e. blur radius) by increasing the box size. There's also a benefit if you're using a compute shader (see below). The downside is that you need additional passes for smooth blurring.
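
As a rough illustration of the repeated box filter, here is a minimal sketch of one 1D box pass as a fragment shader. Note that this is the naive version that loops over the whole box for every pixel, not the constant-time moving average from Fabian's article. The names `image`, `uv`, `direction` and `radius` are my own placeholders.

```glsl
#version 330 core

in vec2 uv;               // texture coordinate from the vertex shader (assumed name)
out vec4 outColor;

uniform sampler2D image;
uniform vec2 direction;   // (1,0) for the horizontal pass, (0,1) for the vertical pass
uniform int radius;       // the box covers 2 * radius + 1 texels

void main()
{
    // Size of one texel step along the blur direction, in UV units.
    vec2 stepUv = direction / vec2(textureSize(image, 0));

    vec4 sum = vec4(0.0);
    for (int i = -radius; i <= radius; ++i)
        sum += texture(image, uv + float(i) * stepUv);

    outColor = sum / float(2 * radius + 1);
}
```

Run it once horizontally and once vertically, then repeat that pair. Each repetition makes the result look closer to a Gaussian, and `radius` is the knob for the blur amount.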


Finally, if you're willing to use compute shaders, you can use a few additional tricks:

* You can use exactly the same shader for both passes (no switching, and no additional push constant / uniform control) by writing to an intermediate transposed image - after an even number of passes you're back to the original orientation. It also reduces code complexity because you only need to deal with either the horizontal or the vertical case.

* You can implement a Gaussian blur (or any convolution really) faster by first loading all the pixels into local workgroup (shared) memory, synchronizing, and then computing the blurred value. You will have redundancy at the edges of the workgroup - so it's best to keep the kernel size small.

* For a box filter, you can simplify further by using a parallel prefix sum implementation in a compute shader (lots of articles out there, I believe the OpenGL SuperBible has an example of this).
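
The first two tricks can be combined roughly as follows. This is only a sketch under my own assumptions: rgba16f images, a hard maximum radius of 16, 128 threads per workgroup, and a destination image whose size is the transposed source size. None of the names come from the linked articles.

```glsl
#version 430 core

const int TILE       = 128;   // must match local_size_x below
const int MAX_RADIUS = 16;    // assumed upper bound for the radius uniform

layout(local_size_x = 128, local_size_y = 1) in;

layout(rgba16f, binding = 0) readonly  uniform image2D srcImage;
layout(rgba16f, binding = 1) writeonly uniform image2D dstImage; // size = (src height, src width)

uniform int radius;           // 0 <= radius <= MAX_RADIUS

// One row segment plus an apron of `radius` texels on each side.
shared vec4 tile[TILE + 2 * MAX_RADIUS];

void main()
{
    ivec2 size = imageSize(srcImage);
    int x  = int(gl_GlobalInvocationID.x);
    int y  = min(int(gl_GlobalInvocationID.y), size.y - 1);
    int lx = int(gl_LocalInvocationID.x);

    // Leftmost source column this workgroup needs.
    int base = int(gl_WorkGroupID.x) * TILE - radius;

    // Cooperative load; columns outside the image are clamped to the border.
    for (int i = lx; i < TILE + 2 * radius; i += TILE)
    {
        int sx = clamp(base + i, 0, size.x - 1);
        tile[i] = imageLoad(srcImage, ivec2(sx, y));
    }

    // Every invocation reaches this point, so the barrier is in uniform control flow.
    memoryBarrierShared();
    barrier();

    if (x < size.x)
    {
        vec4 sum = vec4(0.0);
        for (int i = 0; i <= 2 * radius; ++i)
            sum += tile[lx + i];

        // Transposed write: the next pass can reuse this same shader.
        imageStore(dstImage, ivec2(y, x), sum / float(2 * radius + 1));
    }
}
```

You would dispatch it with something like glDispatchCompute((width + 127) / 128, height, 1), then swap the two images and dispatch again with the transposed dimensions to get back to the original orientation.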


There are some additional techniques such as "Kawase Blur". Check out this nice (though old) summary article:
https://software.intel.com/en-us/blogs/2014/07/15/an-investigation-of-fast-real-time-gpu-based-image-blur-algorithms


Please read my first point again - and consider using a downsampled image.


Lastly, the blur choice will greatly depend on what you're trying to achieve and what your rendering pipeline looks like, since you can fold some costs into other stages or use MRT.

For example, for bloom, you can downsample to quarter-res using bilinear sampling, run a 1D box filter 4-6 times (variable size) and then upsample and smooth again (e.g. by using cubic interpolation) during bloom composition. You can even downsample more aggressively if you really need to.


I'm actually working on a blog post to summarize all these findings... I just need to finalize profiler results first - for a complete comparison.

Hello, and thank you for getting back to me. I think my biggest problem is my level of shader language knowledge. I can't write a shader just by reading some papers; I need a real reference and example.

1) http://rastergrid.com/blog/2010/09/efficient-gaussian-blur-with-linear-sampling/ - I see no explanation or examples of how to control the blur amount.

2) https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch40.html - same here, plus it's a really complicated paper.

3) https://fgiesen.wordpress.com/2012/08/01/fast-blurs-2/ - this features only pseudocode, and the linked method GenBitmap::Blur is, well, I would say unreadable (no comments, variables like x, gh, ph, dh :))

4) https://software.intel.com/en-us/blogs/2014/07/15/an-investigation-of-fast-real-time-gpu-based-image-blur-algorithms - code is provided, but again I can't find the GLSL part. The GLSL code posted has some unexplained inputs.

I need more beginner-level advice, with a clear explanation and a working GLSL example - with no voodoo, just inputs like direction, screensize, bluramount?, uvCoords, texture.

If you can share something like that, please do!

Wow, 3 copies. And I was thinking, that pressing POST button did nothing.

Try the following:

1. Start with something very simple, like hardcoding the Gaussian coefficients in a table and iterating over these values in the fragment shader. You can use Pascal's triangle to get discrete Gaussian coefficients. Here’s some GLSL fragment shader code to get you started:

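#version 420 core  // brace initializer lists require GLSL 4.20 or later

// Texture coordinate passed from the vertex shader.
in VS_OUT
{
  noperspective vec2 uv;
} IN;

out vec4 outColor;

// The image to blur.
uniform sampler2D offscreen;

// xy = offset in texels, z = unused, w = weight.
// The weights are the outer product of (1 2 1) / 4 with itself and sum to 1.
const vec4[] gaussKernel3x3 =
{
  vec4(-1.0, -1.0, 0.0,  1.0 / 16.0),
  vec4(-1.0,  0.0, 0.0,  2.0 / 16.0),
  vec4(-1.0, +1.0, 0.0,  1.0 / 16.0),
  vec4( 0.0, -1.0, 0.0,  2.0 / 16.0),
  vec4( 0.0,  0.0, 0.0,  4.0 / 16.0),
  vec4( 0.0, +1.0, 0.0,  2.0 / 16.0),
  vec4(+1.0, -1.0, 0.0,  1.0 / 16.0),
  vec4(+1.0,  0.0, 0.0,  2.0 / 16.0),
  vec4(+1.0, +1.0, 0.0,  1.0 / 16.0),
};

void main(void)
{
  // Size of one texel in UV units (textureSize returns an ivec2).
  const vec2 texelSize = vec2(1.0) / vec2(textureSize(offscreen, 0));

  // Weighted sum of the 3x3 neighborhood.
  vec4 color = vec4(0.0);
  for (int i = 0; i < gaussKernel3x3.length(); ++i)
    color += gaussKernel3x3[i].w * texture(offscreen, IN.uv + texelSize * gaussKernel3x3[i].xy);

  outColor = color;
}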

What I like about this approach is that you can easily extend it to any filter. This is not production-ready code though, just something to get you up and running (you can also add an offset of 0.5 to get bilinear filtering for free if you want).

2. Change the 2d kernel into 1d and run 2 passes. Most of the work here is OpenGL / Vulkan and not shader code.

3. Use the bilinear sampling trick to reduce the number of texture fetches. This is a math problem: you want to compute w0 * texture(offscreen, uv) + w1 * texture(offscreen, uv + texelSize * vec2(1.0, 0.0)) in a single fetch by adjusting the weights and offset in between.
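
Steps 2 and 3 together could look roughly like the sketch below. It uses the 5-tap row (1 4 6 4 1) / 16 of Pascal's triangle and assumes the texture is sampled with GL_LINEAR filtering. The names `image`, `uv` and `direction` are placeholders of mine. Merging two neighboring taps with weights w1 and w2 at offsets o1 and o2 gives a single fetch with weight w1 + w2 at offset (o1 * w1 + o2 * w2) / (w1 + w2).

```glsl
#version 330 core

in vec2 uv;                 // texture coordinate from the vertex shader (assumed name)
out vec4 outColor;

uniform sampler2D image;    // GL_LINEAR filtering assumed
uniform vec2 direction;     // (1,0) for the first pass, (0,1) for the second pass

// 5-tap binomial kernel (1 4 6 4 1) / 16.
// The two outer taps on each side (4/16 at offset 1, 1/16 at offset 2)
// are merged into one bilinear fetch:
//   weight = 4/16 + 1/16               = 0.3125
//   offset = (1 * 4 + 2 * 1) / (4 + 1) = 1.2 texels
const float CENTER_WEIGHT = 0.375;   // 6/16
const float SIDE_WEIGHT   = 0.3125;
const float SIDE_OFFSET   = 1.2;

void main()
{
    // Size of one texel step along the blur direction, in UV units.
    vec2 stepUv = direction / vec2(textureSize(image, 0));

    vec4 color = CENTER_WEIGHT * texture(image, uv);
    color += SIDE_WEIGHT * texture(image, uv + SIDE_OFFSET * stepUv);
    color += SIDE_WEIGHT * texture(image, uv - SIDE_OFFSET * stepUv);

    outColor = color;
}
```

That is 3 fetches per pass instead of 5, and 6 in total for both passes instead of 25 for the equivalent 5x5 kernel done in one go.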

I think that should get you started for a fixed size Gaussian blur. If you want a variable size kernel, try to do one of the following:

1. Compute the Gaussian kernel incrementally as described in GPU Gems.

2. Successively apply a box blur, which can be implemented as a compute shader.

3. Use pre-filtered blurred images and a variable size Poisson disk for sampling (see the bottom of this article: https://john-chapman.github.io/2019/03/29/convolution.html) to keep the number of fetches constant.
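
For option 1, here is a minimal sketch of a 1D pass that computes the Gaussian weights incrementally, along the lines of the GPU Gems 3 chapter. It relies on the ratio between consecutive Gaussian coefficients itself changing by a constant factor, so no exp() call is needed inside the loop. The uniform names, the 3-sigma cutoff and the final renormalization are my own choices, not something taken from the chapter.

```glsl
#version 330 core

in vec2 uv;               // texture coordinate from the vertex shader (assumed name)
out vec4 outColor;

uniform sampler2D image;
uniform vec2 direction;   // (1,0) for the horizontal pass, (0,1) for the vertical pass
uniform float sigma;      // blur amount in texels, must be > 0

void main()
{
    vec2 stepUv = direction / vec2(textureSize(image, 0));
    int radius = int(ceil(3.0 * sigma));   // cover roughly 3 standard deviations

    // g.x = current coefficient G(i)
    // g.y = ratio G(i + 1) / G(i)
    // g.z = constant factor by which that ratio shrinks every step
    vec3 g;
    g.x = 1.0 / (sqrt(2.0 * 3.14159265) * sigma);
    g.y = exp(-0.5 / (sigma * sigma));
    g.z = g.y * g.y;

    vec4 sum = g.x * texture(image, uv);
    float weightSum = g.x;
    g.xy *= g.yz;                          // advance to G(1)

    for (int i = 1; i <= radius; ++i)
    {
        vec2 offset = float(i) * stepUv;
        sum += g.x * (texture(image, uv + offset) + texture(image, uv - offset));
        weightSum += 2.0 * g.x;
        g.xy *= g.yz;                      // advance to G(i + 1)
    }

    // Renormalize because the kernel is truncated at `radius`.
    outColor = sum / weightSum;
}
```

Here `sigma` is the blur amount. The cost grows linearly with it, which is one more reason to run this on a downsampled image.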

Good luck!

This topic is closed to new replies.
