The Peak-Performance Analysis Method for Optimizing Any GPU Workload [wayback-archive]
- how to use GPU counters to detect interactions between different hardware blocks
- discussion of case studies
- walkthrough of analysis, code examples to show optimizations and discussion of results
Optimized Swapchain in Vulkan [wayback-archive]
- discussion of different approaches, overview of strengths and weaknesses
- recommends splitting command buffers into two groups
- pre-acquire (everything that doesn’t write into the swapchain)
- post-acquire (everything that writes into the swapchain)
- a…
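
As a rough illustration of the pre-acquire / post-acquire split described above, the sketch below partitions a frame's command buffers by whether they write into the swapchain. The `CommandBuffer` record and the `split_by_acquire` helper are hypothetical names made up for this example, not Vulkan API. It assumes no pre-acquire buffer depends on the output of a post-acquire one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CommandBuffer:
    """Hypothetical stand-in for a recorded command buffer."""
    name: str
    writes_swapchain: bool


def split_by_acquire(
    buffers: List[CommandBuffer],
) -> Tuple[List[CommandBuffer], List[CommandBuffer]]:
    """Split buffers into (pre_acquire, post_acquire), keeping relative order.

    Pre-acquire work never touches the swapchain image, so it can be
    submitted before the image has been acquired. Post-acquire work has
    to wait for the acquire to complete.
    """
    pre = [cb for cb in buffers if not cb.writes_swapchain]
    post = [cb for cb in buffers if cb.writes_swapchain]
    return pre, post


frame = [
    CommandBuffer("shadow_maps", writes_swapchain=False),
    CommandBuffer("gbuffer", writes_swapchain=False),
    CommandBuffer("tonemap_to_backbuffer", writes_swapchain=True),
    CommandBuffer("ui_overlay", writes_swapchain=True),
]
pre_acquire, post_acquire = split_by_acquire(frame)
```

In a real renderer the first group would be submitted without waiting on the acquire semaphore, and only the second group would wait on it.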
Tightening Up the Graphics: Tools and Techniques for Debugging and Optimization
- aimed at beginners in graphics programming
- overview of common problems and causes
- RenderDoc explanation
- feature and interface overview
- how to use it to detect problems
- GPU PerfStudio and how to profile your applica…
Experiments in GPU-based occlusion culling part 2: MultidrawIndirect and mesh lodding [wayback-archive]
- using DrawIndexedInstancedIndirect (on AMD and Nvidia with API extensions)
- all vertices in large buffers with manual vertex fetch in the vertex shader
- how to integrate level of detail for mes…
The Rendering of Middle Earth: Shadow of Mordor [wayback-archive]
- gbuffer breakdown
- blood rendering as gbuffer modifier
- tessellation
- uses a point cloud as input and a tessellation shader calculates the polygons
- SSAO has two distinct channels that get applied to specular and diffuse separately
- t…
Unreal Engine 4 Rendering Part 1: Introduction [wayback-archive]
- focus on the deferred shading pipeline
- settings required for good development experience
- how data is passed from update thread -> engine thread -> GPU
- how shader system is structured
- binding between HLSL and C++
- how the corr…
White Paper: Foveated Rendering [wayback-archive]
- overview of typical VR rendering pipeline
- lens distortion causes pixels towards the edges to be distorted more
- foveated rendering takes advantage of this and uses variable quality across the screen
- how to implement it using the OpenGL multiview extension …
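
The variable-quality idea in the foveated rendering summary above can be sketched as a falloff function: full resolution near the screen centre, reduced resolution towards the edges. This is only a minimal illustration and does not use the OpenGL multiview extension the white paper covers. The function name `resolution_scale` and the `inner`, `outer`, and `min_scale` values are assumptions chosen for the example.

```python
import math


def resolution_scale(u: float, v: float,
                     inner: float = 0.35,
                     outer: float = 0.7,
                     min_scale: float = 0.25) -> float:
    """Return a resolution scale in [min_scale, 1.0] for screen position (u, v).

    u and v are normalized screen coordinates in [0, 1]. The radius is
    normalized so that the screen corners sit at r = 1.
    """
    r = math.hypot(u - 0.5, v - 0.5) / math.hypot(0.5, 0.5)
    if r <= inner:
        return 1.0          # central region: full quality
    if r >= outer:
        return min_scale    # periphery: lowest quality
    # Linear blend between full quality and min_scale in the transition band.
    t = (r - inner) / (outer - inner)
    return 1.0 + t * (min_scale - 1.0)


centre = resolution_scale(0.5, 0.5)
corner = resolution_scale(0.0, 0.0)
```

A tiled renderer could evaluate this once per tile centre and choose a render-target size or shading rate from the result.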
PIX 1711.28 – GPU memory usage, TDR debugging, DXIL shader debugging, and child process GPU capture [wayback-archive]
- can track D3D12 heap usage during timing captures
- experimental TDR (Timeout Detection and Recovery) debugging support
- shader debugging for DXIL (new intermediate shader language) …
Decima Engine: Visibility in Horizon Zero Dawn [wayback-archive]
- separate system for statics and dynamics
- world broken into tiles
- sort key is used to define clusters
- LOD ranges, filter masks
- Morton numbers for spatial partitioning
- tile/cluster culling on the CPU
- launch one GPU culling job for ea…
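
For readers unfamiliar with the Morton numbers mentioned above, the sketch below interleaves the bits of a 2D tile coordinate so that sorting by the resulting key keeps spatially close tiles close in memory. It is a generic 16-bit-per-axis Morton encoding and not taken from the Decima talk. The helper names `part1by1` and `morton2d` are chosen for this example.

```python
def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so a zero bit sits between each pair."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x: int, y: int) -> int:
    """Interleave x (even bits) and y (odd bits) into one 32-bit key."""
    return part1by1(x) | (part1by1(y) << 1)


# Sorting tile coordinates by Morton key yields a Z-order traversal.
tiles = [(1, 1), (0, 1), (1, 0), (0, 0)]
z_order = sorted(tiles, key=lambda t: morton2d(*t))
```

With these four tiles the sort visits (0, 0), (1, 0), (0, 1), (1, 1), the characteristic "Z" pattern.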
Last Week on DirectX Shader Compiler (2017-11-14) [wayback-archive]
- support for explicitly sized types 16 to 64 bits
- SPIR-V improvements
- improved performance of occlusion ray packets by up to 50%
Experiments in GPU-based occlusion culling [wayback-archive]
- A GPU occlusion syst…