DirectX 11 Device context question.

Started by October 22, 2018 02:11 PM
8 comments, last by GuyWithBeard 6 years, 3 months ago

I have a question about ID3D11DeviceContext for any DirectX 11 gurus out there. I'm trying to add threads to my engine and it started crashing. After a quick Google search I discovered that you can't safely use the ID3D11DeviceContext from more than one thread without some sort of synchronization. I suppose I can put mutexes around things, but then I noticed there is something called a deferred device context, and I'm not exactly clear on the details of how it works. What I'm trying to do is create a new tree of meshes while rendering the current tree of meshes. When the new tree is ready I'll do a swap. This is for LOD. At some point I want to do a fade in and out sort of thing, but for right now I'll just have the new tree blink in.

In any case I need the device context to do the rendering in one thread, but I also need it to create, map and unmap vertex and index buffers in a second thread. My questions are:

1) Is there a standard way to do what I'm trying to do?

2) Is a deferred context something I can use to achieve this, and if so is there a good online resource (preferably with examples) that someone can point me to?

Thanks

And while I'm at it, one more question: if I'm using mutexes for this, do I need to (or should I) hold the device context for the whole write to the buffer, or can I just grab it for the map and release it, and then grab it again for the unmap and release it again?

Could you use temporary buffers in your construction-thread, and when the thread is finished do a create, map, copy, unmap to a DX11 buffer from your main app?
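A minimal sketch of that hand-off, assuming the worker thread only ever touches CPU memory and the render thread is the only one that talks to the device context. `StagedMesh`, `MeshTreeMailbox` and `uploadTree` are hypothetical names for illustration; `uploadTree` is a stand-in that just counts bytes where a real engine would create and fill the D3D11 buffers.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// CPU-side mesh data built by the worker thread. No D3D11 objects live here.
struct StagedMesh {
    std::vector<float> vertices;
    std::vector<std::uint32_t> indices;
};

// Hypothetical mailbox: the worker publishes a finished tree and the render
// thread collects it. The mutex guards only the pointer hand-off, never the
// device context.
class MeshTreeMailbox {
public:
    // Called from the worker thread once the whole tree has been built.
    void publish(std::vector<StagedMesh> tree) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = std::make_unique<std::vector<StagedMesh>>(std::move(tree));
    }

    // Called from the render thread, e.g. once per frame.
    // Returns nullptr when no new tree is waiting.
    std::unique_ptr<std::vector<StagedMesh>> take() {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::move(pending_);  // leaves pending_ empty
    }

private:
    std::mutex mutex_;
    std::unique_ptr<std::vector<StagedMesh>> pending_;
};

// Stand-in for the GPU upload done on the render thread. It only reports how
// many bytes would be copied; a real version would create the buffers and
// copy the data into them.
std::size_t uploadTree(const std::vector<StagedMesh>& tree) {
    std::size_t bytes = 0;
    for (const StagedMesh& mesh : tree) {
        bytes += mesh.vertices.size() * sizeof(float);
        bytes += mesh.indices.size() * sizeof(std::uint32_t);
    }
    return bytes;
}
```

The render loop would call `take()` each frame and, when it returns a tree, upload it and swap it in before drawing, so the lock is held only for a pointer move.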

.:vinterberg:.

13 minutes ago, vinterberg said:

Could you use temporary buffers in your construction-thread, and when the thread is finished do a create, map, copy, unmap to a DX11 buffer from your main app?

I actually kind of do that anyway. The meshes are created in a voxel octree. I'm just wondering what the best way to handle the copy to the card is in a threaded environment.

I should add that on the CPU side it's my own mesh format so it's not just a memcpy to the buffers, however it can be done with a reasonably simple loop.

Before you do any big rewrite using deferred contexts, I warn you that AMD GPUs don't support those (last I checked). Also, the DX11 implementation for these can't be very efficient, because resource dependencies can't be declared in the API, so the driver will serialize your command lists when you call ExecuteCommandList and validate dependencies between them anyway. I'm not saying that there is no gain from this; I think some games used it successfully (Civilization V afaik).

Another method would be to have your own command list implementation that in the end generates DX11 commands when you submit it. This would be a bigger task to implement, but it would work on Nvidia, AMD, etc.
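A rough sketch of what such a home-grown command list could look like, assuming commands are recorded as closures on any thread and replayed on the one thread that owns the immediate context. `FakeContext` and `CommandList` are hypothetical names; `FakeContext` just logs strings where a real renderer would call ID3D11DeviceContext methods.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the immediate context. A real renderer would issue D3D11
// calls here instead of appending to a log.
struct FakeContext {
    std::vector<std::string> log;
};

// Hypothetical API-agnostic command list: recording stores closures,
// submit() replays them in order against the context.
class CommandList {
public:
    // Recording functions never touch the context, so each worker thread can
    // fill its own CommandList without any locking.
    void setShader(std::string name) {
        commands_.push_back([n = std::move(name)](FakeContext& ctx) {
            ctx.log.push_back("SetShader " + n);
        });
    }

    void drawIndexed(unsigned indexCount) {
        commands_.push_back([indexCount](FakeContext& ctx) {
            ctx.log.push_back("DrawIndexed " + std::to_string(indexCount));
        });
    }

    // Must be called on the thread that owns the context.
    void submit(FakeContext& ctx) {
        for (auto& command : commands_) {
            command(ctx);
        }
        commands_.clear();
    }

    std::size_t size() const { return commands_.size(); }

private:
    std::vector<std::function<void(FakeContext&)>> commands_;
};
```

Each worker records into its own list, and the render thread submits the lists one after another in whatever order the frame requires. Closures are the simplest way to show the idea; a production version would more likely encode commands into a flat buffer to avoid one allocation per command.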

Anyway, you can Map on deferred contexts with the D3D11_MAP_WRITE_DISCARD and D3D11_MAP_WRITE_NO_OVERWRITE flags, if I remember correctly. You can use UpdateSubresource() too. But you can't read data back with D3D11_MAP_READ, or read query results.

2 hours ago, turanszkij said:

Before you do any big rewrite using deferred contexts, I warn you that AMD GPUs don't support those (last I checked). Also, the DX11 implementation for these can't be very efficient, because resource dependencies can't be declared in the API, so the driver will serialize your command lists when you call ExecuteCommandList and validate dependencies between them anyway. I'm not saying that there is no gain from this; I think some games used it successfully (Civilization V afaik).

Another method would be to have your own command list implementation that in the end generates DX11 commands when you submit it. This would be a bigger task to implement, but it would work on Nvidia, AMD, etc.

Anyway, you can Map on deferred contexts with the D3D11_MAP_WRITE_DISCARD and D3D11_MAP_WRITE_NO_OVERWRITE flags, if I remember correctly. You can use UpdateSubresource() too. But you can't read data back with D3D11_MAP_READ, or read query results.

I'm not sure I understood everything, but I did manage to get it working with mutexes. I guess I'll stick with that for now and revisit deferred contexts later.

3 hours ago, turanszkij said:

Before you do any big rewrite using deferred contexts, I warn you that AMD gpus don't support those (last I checked).

Hmm, where did you read that? I have been using deferred contexts quite happily for the last year or so. My main development machine has a Radeon RX 580.

2 hours ago, GuyWithBeard said:

Hmm, where did you read that? I have been using deferred contexts quite happily for the last year or so. My main development machine has a Radeon RX 580.

I actually tried it on multiple AMD GPUs, one of them being an RX 470 if I remember correctly. But that was more than 2 years ago, so maybe they have caught up by now.

I can't check on AMD right now, but I checked on my Intel integrated 620 and that also doesn't support it.

 

On 10/22/2018 at 1:30 PM, GuyWithBeard said:

Hmm, where did you read that? I have been using deferred contexts quite happily for the last year or so. My main development machine has a Radeon RX 580.

AMD doesn't support the "Driver Command Lists" feature for deferred contexts. This means that the D3D11 runtime lets you use deferred contexts, but instead of storing commands in actual hardware-specific command buffers it stores them in device-agnostic intermediate buffers. The runtime then serializes those commands and passes them to the driver to create the final command buffer for the GPU. While this can possibly let you parallelize certain aspects of submitting commands, the actual command buffer generation is going to happen on a single thread. Thus you may be better off using a single thread and letting the CPU reach peak turbo clocks instead of trying to use multiple threads/cores to generate deferred command lists.
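For anyone who wants to check this on their own hardware, the flag can be queried through ID3D11Device::CheckFeatureSupport with D3D11_FEATURE_THREADING. The sketch below is only an illustration: `ThreadingCaps`, `queryThreadingCaps`, `SubmissionStrategy` and `chooseStrategy` are hypothetical names, and the policy in `chooseStrategy` is just one reading of the advice above, not a measured recommendation. The D3D11 part is guarded so the rest compiles on any platform.

```cpp
// Portable mirror of the two flags in D3D11_FEATURE_DATA_THREADING.
struct ThreadingCaps {
    bool driverConcurrentCreates = false;  // driver creates resources concurrently
    bool driverCommandLists = false;       // driver builds native command lists
};

#ifdef _WIN32
#include <d3d11.h>

// Asks the driver which threading features it implements natively.
// Returns false if the query itself fails.
inline bool queryThreadingCaps(ID3D11Device* device, ThreadingCaps& out) {
    D3D11_FEATURE_DATA_THREADING data = {};
    const HRESULT hr = device->CheckFeatureSupport(
        D3D11_FEATURE_THREADING, &data, sizeof(data));
    if (FAILED(hr)) {
        return false;
    }
    out.driverConcurrentCreates = data.DriverConcurrentCreates != FALSE;
    out.driverCommandLists = data.DriverCommandLists != FALSE;
    return true;
}
#endif

enum class SubmissionStrategy {
    RecordOnWorkerThreads,  // deferred contexts backed by the driver
    SubmitOnRenderThread    // runtime would emulate command lists anyway
};

// Hypothetical policy: only spread command recording across threads when
// the driver supports command lists natively.
inline SubmissionStrategy chooseStrategy(const ThreadingCaps& caps) {
    return caps.driverCommandLists ? SubmissionStrategy::RecordOnWorkerThreads
                                   : SubmissionStrategy::SubmitOnRenderThread;
}
```

Deferred contexts still work when `driverCommandLists` is false, since the runtime emulates them. The flag only says whether the recording work is really done by the driver on the recording thread.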

Like turanszkij mentioned, D3D11 is just a really poor fit for multithreading in terms of its core level of abstraction. It tries to hide dependencies and asynchronous execution from you, and it's hard to do all of that hiding and abstraction unless the command submission is single-threaded or otherwise serialized.

Excellent explanation MJP, thanks. I have been using deferred contexts solely because they more closely match the command list/command buffer concept used by DX12/Vulkan. While I support all three APIs my renderer is in fact still single threaded so I haven't done any serious multithreaded command submission yet, which is why I haven't looked all that much into the performance of the deferred contexts vs. immediate contexts on my AMD machine.

