Deferred device contexts

Started by November 12, 2017 01:38 PM
15 comments, last by Infinisearch 7 years, 2 months ago

I know you can record D3D11 commands via a deferred device context into command lists, which can then be replayed by the immediate context. But how does this work for buffers?

Let's say I need to do the following to render a single model:

  1. Update the model buffer.
  2. Bind the model buffer.
  3. Bind the SRVs.
  4. Bind the input layout.
  5. Draw the model.

Is it possible to concatenate multiple such sequences of commands before replaying? Or do I have to replay after only one sequence, because each sequence re-updates the buffer data?

How many threads, each with its own deferred context, does one normally set loose on the models? Does one still use the immediate context for direct model rendering, or does it only replay command lists?

🧙

When updating buffers, you have two options: UpdateSubresource() and Map(). UpdateSubresource() works on deferred contexts as it does normally, meaning it probably creates a copy of the data and uploads it to the GPU. When using Map() on a deferred context, you can only use the WRITE_DISCARD or NO_OVERWRITE flags. WRITE_DISCARD is a buffer rename operation: the driver allocates a new copy of the buffer in CPU-accessible memory and gives you a pointer that you can write to. With NO_OVERWRITE, you tell the driver that you just want access to the memory and that your application will explicitly ensure there are no race conditions on the resource, so neither the GPU nor another CPU thread will be competing for the same memory.

Take constant buffers, for example, which are usually updated with WRITE_DISCARD. If you have a global constant buffer like PerFrameVariables, then you have to update it in each deferred context that references it.
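To make the rename idea concrete, here is a minimal sketch in portable C++ of how several "update buffer, then draw" sequences can sit in one recorded list. It is a toy model, not the D3D11 API: `Buffer`, `DeferredContext`, `mapDiscardAndWrite`, `execute` and `recordAndReplayThreeModels` are hypothetical names, and the assumption that every discard gets its own allocation is a simplification of whatever a real driver does.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy model only: none of these types belong to D3D11.
struct Buffer {
    // The version of the contents that draws currently read from.
    std::shared_ptr<const std::vector<float>> current;
};

using CommandList = std::vector<std::function<void()>>;

class DeferredContext {
public:
    // Models Map(WRITE_DISCARD) + write + Unmap: the data goes into a fresh
    // allocation, and the list records "switch the buffer to that allocation".
    void mapDiscardAndWrite(Buffer& buffer, std::vector<float> data) {
        std::shared_ptr<const std::vector<float>> renamed =
            std::make_shared<std::vector<float>>(std::move(data));
        commands_.push_back([&buffer, renamed] { buffer.current = renamed; });
    }

    // Models a draw: at replay time it reads whatever version is current.
    void draw(const Buffer& buffer, std::vector<float>& drawn) {
        commands_.push_back(
            [&buffer, &drawn] { drawn.push_back(buffer.current->front()); });
    }

    // Models FinishCommandList: hands over the recording and starts afresh.
    CommandList finish() { return std::exchange(commands_, {}); }

private:
    CommandList commands_;
};

// Models ExecuteCommandList on the immediate context.
void execute(const CommandList& list) {
    for (const auto& command : list) command();
}

// Records update+draw for three models into one list, then replays it once.
std::vector<float> recordAndReplayThreeModels() {
    Buffer modelBuffer;
    std::vector<float> drawn;
    DeferredContext context;
    for (float modelId : {1.0f, 2.0f, 3.0f}) {
        context.mapDiscardAndWrite(modelBuffer, {modelId});
        context.draw(modelBuffer, drawn);
    }
    CommandList list = context.finish();
    assert(drawn.empty());  // recording alone executes nothing
    execute(list);
    return drawn;  // each draw saw the data written just before it
}
```

In this model the three draws read 1, 2 and 3 in order even though the same buffer was rewritten three times before anything was replayed, because each recorded update carries its own copy of the data.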

53 minutes ago, turanszkij said:

which are usually updated with WRITE_DISCARD

So assume I use the Map function with the WRITE_DISCARD flag for my constant buffers.

Does it just work when multiple deferred contexts perform this Map operation on the same resource at the same time? I guess yes, but then a local copy needs to be stored as part of the command in the command list?

Furthermore, does it also work if one deferred context performs multiple Map operations to the same resource at the same time? I guess D3D11 uses the same mechanism and multiple local copies are stored for the same command list?

So if my reasoning is correct, memory usage explodes if one records huge command lists?

🧙

On the application side, you can keep a single buffer resource and Map it as many times as you like, including from different contexts. The allocations are done by the driver, and I assume it allocates constant buffers from the command list memory. You might want to avoid doing this a very large number of times per frame: for example, AMD GCN drivers have a command buffer of 4MB, and once you exceed that limit there is probably some sort of synchronization involved.
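As a rough feel for the numbers, the sketch below estimates how many bytes of renamed copies a frame would need if every WRITE_DISCARD cost exactly one fresh copy of the constant buffer. That one-copy-per-discard cost is an assumption for illustration, not documented driver behavior, and the helper names are hypothetical. The 16-byte rounding reflects the D3D11 rule that a constant buffer's ByteWidth must be a multiple of 16.

```cpp
#include <cstddef>

// D3D11 requires a constant buffer's size to be a multiple of 16 bytes.
constexpr std::size_t alignedConstantBufferSize(std::size_t structBytes) {
    return (structBytes + 15) / 16 * 16;
}

// Assumed worst case: every discard allocates one full copy of the buffer.
constexpr std::size_t renameBytesPerFrame(std::size_t structBytes,
                                          std::size_t discardsPerFrame) {
    return alignedConstantBufferSize(structBytes) * discardsPerFrame;
}

// A 4x4 float matrix (64 bytes) plus two floats is 72 bytes, stored as 80.
static_assert(alignedConstantBufferSize(72) == 80, "rounds up to 16");
// 10,000 per-model updates of that buffer would need 800,000 bytes of copies.
static_assert(renameBytesPerFrame(72, 10000) == 800000, "80 * 10000");
```

Under these assumptions, ten thousand small per-model updates stay below 1MB per frame, so the total only becomes a concern with much larger buffers or far more updates.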

10 minutes ago, turanszkij said:

for example AMD GCN drivers have a command buffer of 4MB

Ah ok, so some amount of memory is allocated in advance by the driver and more is allocated if you need more.

But 4MB seems quite a lot for some model data for example, so one or two deferred contexts may record all or half of the commands and the immediate context needs to only replay two command lists per frame in this case.

🧙

4 hours ago, matt77hias said:

But 4MB seems quite a lot for some model data for example,

4MB is for command buffers... unless I'm misunderstanding something, those are the buffers that tell the GPU what to do, not any other data.

-potential energy is easily made kinetic-

4 hours ago, turanszkij said:

The allocations are done by the driver and I assume they allocate constant buffers from the command list memory.

I don't know about AMD but Nvidia doesn't.  See here: https://developer.nvidia.com/content/constant-buffers-without-constant-pain-0

Also, it sort of doesn't make sense. IIRC GPUs consume command buffers from a circular buffer. I don't think you would break that continuity with constant buffers (it would take longer to find a command). I could be wrong though... maybe someone with more knowledge can chime in.

-potential energy is easily made kinetic-

9 hours ago, turanszkij said:

When updating buffers, you have two options: UpdateSubresource() and Map(). UpdateSubresource() works on deferred contexts as it does normally, meaning it probably creates a copy of the data and uploads it to the GPU. When using Map() on a deferred context, you can only use the WRITE_DISCARD or NO_OVERWRITE flags. WRITE_DISCARD is a buffer rename operation: the driver allocates a new copy of the buffer in CPU-accessible memory and gives you a pointer that you can write to. With NO_OVERWRITE, you tell the driver that you just want access to the memory and that your application will explicitly ensure there are no race conditions on the resource, so neither the GPU nor another CPU thread will be competing for the same memory.

Take constant buffers, for example, which are usually updated with WRITE_DISCARD. If you have a global constant buffer like PerFrameVariables, then you have to update it in each deferred context that references it.

I thought that if you update this using the immediate context prior to recording your commands in the command buffer, you don't have to update it again. You just have to rebind it in each context.

Indie game developer - Game WIP

Strafe (Working Title) - Currently in need of another developer and modeler/graphic artist (professional & amateur's artists welcome)

Insane Software Facebook

12 hours ago, ErnieDingo said:

I thought that if you update this using the immediate context prior to recording your commands in the command buffer, you don't have to update it again. You just have to rebind it in each context.

You are probably right, I have only used them a long time ago, can't remember that well. :)

20 hours ago, ErnieDingo said:

I thought that if you update this using the immediate context prior to recording your commands in the command buffer, you don't have to update it again. You just have to rebind it in each context.

But does the command allocate memory for buffer mappings or buffer updates?

🧙
