Judging by Microsoft's working code samples and people's posts on forums, most people seem to implement double buffering in the following way:

                                                                   fence 0
    frame 0: | submit command | execute command ......................|
    frame 1:                  | submit command | execute command .....|....................

It seems that people tend to submit and execute the commands for the current frame first, and only _then_ wait on the fence for the previous frame to finish. This seems counter-intuitive to me, since the potential overlap between the two frames means duplicating temporary per-frame data.
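To make the ordering concrete, here is a minimal sketch of what I mean. It is not real D3D12 code: `SubmitThenWait`, the step names and the two-slot scheme are made up for illustration, and the function only lists the order of CPU-side steps, not their timing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: lists the CPU-side steps of the first approach.
// "submit N" stands for executing frame N's command list and signaling
// its fence; "wait N" stands for blocking until frame N's fence is reached.
std::vector<std::string> SubmitThenWait(int frameCount) {
    assert(frameCount >= 0);
    std::vector<std::string> steps;
    for (int frame = 0; frame < frameCount; ++frame) {
        // Assumed: two copies of the per-frame data, used alternately.
        const int slot = frame % 2;
        steps.push_back("record " + std::to_string(frame) +
                        " into slot " + std::to_string(slot));
        steps.push_back("submit " + std::to_string(frame));
        if (frame > 0) {
            // Frame N is already queued before we block on frame N-1.
            steps.push_back("wait " + std::to_string(frame - 1));
        }
    }
    return steps;
}
```

For three frames this lists `record 0 into slot 0`, `submit 0`, `record 1 into slot 1`, `submit 1`, `wait 0`, `record 2 into slot 0`, `submit 2`, `wait 1`. Note that `submit 1` comes before `wait 0`, which is the part I find counter-intuitive.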
My question is why not just do it this way instead:

                                                                   fence 0
    frame 0: | submit command | execute command ......................|
    frame 1:                                         | submit command | execute command .....|...............

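In terms of call order, the ordering I am proposing would look roughly like the sketch below. Again, this is not real D3D12 code: `WaitThenSubmit` and the step names are hypothetical, and the function only lists the order of CPU-side steps without modeling any timing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: lists the CPU-side steps of the proposed approach.
// "submit N" stands for executing frame N's command list and signaling
// its fence; "wait N" stands for blocking until frame N's fence is reached.
std::vector<std::string> WaitThenSubmit(int frameCount) {
    assert(frameCount >= 0);
    std::vector<std::string> steps;
    for (int frame = 0; frame < frameCount; ++frame) {
        // Recording still happens while the previous frame may be executing.
        steps.push_back("record " + std::to_string(frame));
        if (frame > 0) {
            // Block on frame N-1 before frame N is handed to the queue.
            steps.push_back("wait " + std::to_string(frame - 1));
        }
        steps.push_back("submit " + std::to_string(frame));
    }
    return steps;
}
```

For three frames this lists `record 0`, `submit 0`, `record 1`, `wait 0`, `submit 1`, `record 2`, `wait 1`, `submit 2`, so each frame is only submitted once the previous one has finished.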
This way, CPU work still overlaps with GPU work, but nothing needs to be duplicated. Yes, serializing frame 0 and frame 1 so that they no longer overlap is probably bad for performance, but can it be that bad? If frames finish on time, I would expect this overlap never to occur in the first place. So I'm wondering why the majority prefers the first approach, even though the second is simpler and seems more natural to me. Thanks.