Hi!
I'm working on a third-party plugin/driver which renders through either D3D11 or D3D12 in either exclusive fullscreen or windowed mode.
I have performance problems with SwapChain::Present when using D3D12.
For both D3D11 and D3D12, I create a swap chain with 2 backbuffers and a _DISCARD swap effect, and call pSwapChain->Present(0, 0) at the end of the frame.
For the sake of the test, I do all rendering in the main thread. Looking at the threads in the task manager reveals the following:
- With D3D11, the thread runs at full speed (~12% CPU on my machine), giving a very high frame rate (1000+ FPS)
- With D3D12, the thread runs at about half of that (~6-7% CPU), giving a much lower frame rate
Using Concurrency Visualizer to determine where the thread is being blocked, I found that the blocking occurs inside SwapChain::Present itself. I just can't figure out why. The timeline of the thread is full of 'waiting holes', see the attached image. The hardware is a GeForce GTX 1060:
![](https://uploads.gamedev.net/forums/monthly_2020_02/85511285a9634009800bb23331b9bf67.PresentNV.png)
To make sure that it's not a driver problem, I ran my app on an Intel HD GPU too, and encountered the same phenomenon. Using Concurrency Visualizer again, I got this:
![](https://uploads.gamedev.net/forums/monthly_2020_02/475f6facd4074a4aa54cd3b53f69eba8.PresentIntel.png)
Interestingly, with this driver Present goes through d3d11 even when rendering with D3D12, but again, there is a wait on a present mutex or something like that. It's so bad that breaking into the rendering by pressing F12 in Visual Studio gives me the stack above in most cases, which indicates that the thread spends most of its time blocked there.
My question: why does this happen, and how can I avoid it?
Since I create my swap chain with DXGI_SWAP_EFFECT_FLIP_DISCARD and call Present(0, 0), I wouldn't expect any blocking in the rendering process (aside from some waiting on fences while executing command lists).
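In case it helps anyone reproduce the numbers without Concurrency Visualizer, here is a minimal sketch of how the time spent blocked in Present could be measured directly in code. It is not my actual plugin code: `PresentTimer` and `SimulateFrames` are hypothetical names made up for this example, and the sleeping lambda is only a stand-in for the real `pSwapChain->Present(0, 0)` call.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: accumulates how long a per-frame call blocks
// the calling thread (average and worst case).
struct PresentTimer {
    using Clock = std::chrono::steady_clock;

    std::chrono::nanoseconds total{0};
    std::chrono::nanoseconds worst{0};
    unsigned frames = 0;

    // Times a single invocation of `present` and records the result.
    template <typename PresentFn>
    void Measure(PresentFn&& present) {
        const auto start = Clock::now();
        present();
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - start);
        total += elapsed;
        if (elapsed > worst) worst = elapsed;
        ++frames;
    }

    double AverageMs() const {
        if (frames == 0) return 0.0;
        return std::chrono::duration<double, std::milli>(total).count() / frames;
    }

    double WorstMs() const {
        return std::chrono::duration<double, std::milli>(worst).count();
    }
};

// Assumption for illustration only: the lambda sleeps to imitate a blocking
// call. In the real render loop it would wrap pSwapChain->Present(0, 0).
inline PresentTimer SimulateFrames(int frameCount, std::chrono::milliseconds blockTime) {
    PresentTimer timer;
    for (int frame = 0; frame < frameCount; ++frame) {
        timer.Measure([blockTime] { std::this_thread::sleep_for(blockTime); });
    }
    return timer;
}
```

In the real loop the idea would be to call `timer.Measure([&] { pSwapChain->Present(0, 0); });` once per frame and compare `AverageMs()` between the D3D11 and D3D12 paths. A steady clock is used so the measurement is not affected by system clock adjustments.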