Hi,
I use the following shader (HLSL) to scale positions, normals, and depth for post processing:
Texture2D normalTexture : register(t0);
Texture2D positionTexture : register(t1);
Texture2D<float> depthTexture : register(t2);

static const uint POST_PROCESSING_SCALING_MASK = 0xC0;
static const uint POST_PROCESSING_SCALING_FULL = 0x00;
static const uint POST_PROCESSING_SCALING_HALF = 0x40;
static const uint POST_PROCESSING_SCALING_QUARTER = 0x80;

cbuffer PixelBuffer{
    float fogStart;
    float fogMax;
    uint Flags;
    float Gamma;
    float3 fogColor;
    float SpaceAlpha;
    float FrameTime;
    float MinHDRBrightness;
    float MaxHDRBrightness;
    float ScreenWidth;
    float3 SemiTransparentLightColor;
    float ScreenHeight;
    float SpaceAlphaFarAway;
    float SpaceAlphaNearPlanets;
    float HDRFalloffFactor;
    float InverseGamma;
    float3 CameraLiquidColor;
    float CameraLiquidVisualRange;
    float CameraInLiquidMinLerpFactor;
    float CameraInLiquidMaxLerpFactor;
    float MinCloudBrightness;
    float MaxCloudBrightness;
    float4 BorderColor;
    float3 SunColor;
    float padding0;
};

struct PixelInputType{
    float4 position : SV_POSITION;
};

struct PixelOutputType{
    float4 normal : SV_Target0;
    float4 position : SV_Target1;
    float depth : SV_Depth;
};

PixelOutputType main(PixelInputType input){
    PixelOutputType output;

    // Which scaling mode is active: 0 = full, 1 = half, 2 = quarter resolution.
    const uint ScalingFlag = (Flags & POST_PROCESSING_SCALING_MASK) >> 6;
    uint3 TexCoords = uint3(uint2(input.position.xy) << ScalingFlag, 0);
    const uint Max = (1 << ScalingFlag) << ScalingFlag;

    // Find the first texel in the block that holds valid data (position.w >= 0).
    uint UsedIndex = 0xFFFFFFFF;
    for (uint i = 0; i < Max && i < 16; ++i) {
        const uint3 CurrentTexCoords = TexCoords + uint3(i & ((1 << ScalingFlag) - 1), i >> ScalingFlag, 0);
        output.position = positionTexture.Load(CurrentTexCoords);
        if (output.position.w >= 0.0f) {
            UsedIndex = i;
            break;
        }
    }

    if (UsedIndex == 0xFFFFFFFF) {
        output.normal = float4(0, 0, 0, 0);
        output.depth = 1.0f;
        discard;
    }
    else {
        const uint3 CurrentTexCoords2 = TexCoords + uint3(UsedIndex & ((1 << ScalingFlag) - 1), UsedIndex >> ScalingFlag, 0);
        output.normal = normalTexture.Load(CurrentTexCoords2);
        output.depth = depthTexture.Load(CurrentTexCoords2);
    }

    return output;
}
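For context, the scaling part works like this: with POST_PROCESSING_SCALING_HALF, ScalingFlag is 1 and the loop searches a 2x2 block of the half-resolution textures (Max = 4); with POST_PROCESSING_SCALING_QUARTER, ScalingFlag is 2 and it searches a 4x4 block (Max = 16). The offsets come from the low and high bits of i. Here is the same offset arithmetic pulled out as a standalone sketch (BlockOffset is just a name for this illustration, it's not in the shader):

// Illustration only, not part of the shader above: the same offset arithmetic as in the loop.
uint2 BlockOffset(uint ScalingFlag, uint i) {
    // x comes from the low ScalingFlag bits of i, y from the remaining high bits,
    // so i = 0..Max-1 walks a (1 << ScalingFlag) x (1 << ScalingFlag) block row by row.
    return uint2(i & ((1 << ScalingFlag) - 1), i >> ScalingFlag);
}
// Example: ScalingFlag == 1 visits the offsets (0,0), (1,0), (0,1), (1,1);
// ScalingFlag == 2 visits the 16 offsets of a 4x4 block.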
It doesn't work with the latest drivers on AMD graphics cards, neither on my laptop (R7 M270) nor on a friend's computer (RX 480).
It works perfectly on Nvidia/Intel GPUs and it worked with older drivers on my laptop, too.
On AMD GPUs, it outputs wrong data, i.e. output.position.w < 0, even though that should be impossible: if no sample in the block has w >= 0, UsedIndex stays 0xFFFFFFFF and the pixel is discarded, so any pixel that survives should have output.position.w >= 0. If I manually set output.position.w = 0 in both branches of the (UsedIndex == 0xFFFFFFFF) check, the w component comes out as 0. If I set it in only one of the branches (it doesn't matter which one), output.position.w is negative in the output.
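To be explicit, this is the modification I mean (only the final if/else of main is shown, with the two added lines marked). With both lines in place, w comes out as 0; with only one of them (either one), w is negative again:

if (UsedIndex == 0xFFFFFFFF) {
    output.normal = float4(0, 0, 0, 0);
    output.depth = 1.0f;
    output.position.w = 0.0f; // added
    discard;
}
else {
    const uint3 CurrentTexCoords2 = TexCoords + uint3(UsedIndex & ((1 << ScalingFlag) - 1), UsedIndex >> ScalingFlag, 0);
    output.normal = normalTexture.Load(CurrentTexCoords2);
    output.depth = depthTexture.Load(CurrentTexCoords2);
    output.position.w = 0.0f; // added
}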
The other outputs (positions.xyz, normals.xyz, and depth) seem to be correct, but I'm not sure about normals.w.
It makes no sense to me, and I think it's a driver bug. What can I do?
Cheers,
Magogan
Edit: If I add the following code before returning output, it gets even weirder:
if (output.position.w < 0) {
    output.position.w = 0;
}
else {
    output.position.w = 1;
}
After that, output.position.w is neither 0 nor 1, but something that is not >= 0 either, so probably NaN or negative. WTF?