So I've been trying to implement triple-buffering in my application by changing the BufferCount parameter of DXGI_SWAP_CHAIN_DESC, but regardless of what I set it to, there is no detectable change in performance or latency. Let me elaborate...
I would expect that increasing the number of swap chain buffers would increase latency. So I started experimenting: first, I added a 50ms sleep to every frame to artificially limit the FPS to about 20. Then I tried setting BufferCount to 1, 2, 4, 8, and 16 (the highest it would go without crashing) and tested latency by moving my game's camera. With a BufferCount of 1 and an FPS of ~19, my game was choppy but otherwise had low latency. With a BufferCount of 16, though, I would expect 16 frames of latency, which at ~19 FPS is almost a whole second of lag. That should certainly be noticeable just by moving the game camera, but there was no more latency than with a BufferCount of 1. (And none of the other values I tried had any effect either.)
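My actual setup has a lot more going on, but the swap-chain creation is roughly equivalent to this sketch (D3D11, error handling stripped out, and the width/height/format/refresh-rate numbers below are just placeholders, not my real values):

```cpp
#include <windows.h>
#include <d3d11.h>

#pragma comment(lib, "d3d11.lib")

// Simplified swap-chain creation; bufferCount is the value I've been varying
// (1, 2, 4, 8, 16).
HRESULT CreateDeviceAndSwapChain(HWND hwnd, UINT bufferCount,
                                 ID3D11Device** device,
                                 ID3D11DeviceContext** context,
                                 IDXGISwapChain** swapChain)
{
    DXGI_SWAP_CHAIN_DESC desc = {};
    desc.BufferCount                        = bufferCount;
    desc.BufferDesc.Width                   = 1920;                       // placeholder
    desc.BufferDesc.Height                  = 1080;                       // placeholder
    desc.BufferDesc.Format                  = DXGI_FORMAT_R8G8B8A8_UNORM; // placeholder
    desc.BufferDesc.RefreshRate.Numerator   = 60;
    desc.BufferDesc.RefreshRate.Denominator = 1;
    desc.BufferUsage                        = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    desc.OutputWindow                       = hwnd;
    desc.SampleDesc.Count                   = 1;
    desc.Windowed                           = FALSE;                      // (I think) exclusive fullscreen
    desc.SwapEffect                         = DXGI_SWAP_EFFECT_DISCARD;

    return D3D11CreateDeviceAndSwapChain(
        nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0, // default adapter, hardware driver, no flags
        nullptr, 0, D3D11_SDK_VERSION,                 // default feature levels
        &desc, swapChain, device, nullptr, context);
}
```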
Another possibly-related thing that's confusing me: I read that with vsync on (and no triple-buffering), the FPS should lock to your monitor's refresh rate divided by an integer (i.e., 60, 30, 20, 15, etc. on a 60Hz display), since any frame that takes longer than a vertical blank has to wait for the next one before being presented. And indeed, when I give Present a SyncInterval of 1, my FPS is capped at 60. But my FPS does *not* drop to 30 once a frame takes longer than 1/60 of a second, as I would expect; if I get about 48 FPS with vsync off, I still get about 48 FPS with vsync on. (And no, this isn't an artifact of averaging frame times. I'm recording individual frame times and they're all very stable at around 1/48 second. I've also checked my GPU settings for any kind of adaptive vsync but couldn't find any.)
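To be clear about how I'm measuring this: the frame times come from a per-frame timer around Present, roughly like the sketch below (simplified; the 50ms sleep is the artificial limiter from the BufferCount test above and is off for the 48 FPS numbers):

```cpp
#include <chrono>
#include <thread>
#include <windows.h>
#include <dxgi.h>

// Simplified render loop showing how I time individual frames around Present.
void RunFrames(IDXGISwapChain* swapChain, bool vsync, bool artificialLimiter)
{
    using clock = std::chrono::steady_clock;
    auto lastFrame = clock::now();

    for (;;)
    {
        // ... update and draw the scene ...

        if (artificialLimiter)
            std::this_thread::sleep_for(std::chrono::milliseconds(50)); // ~20 FPS cap (BufferCount test)

        // SyncInterval 1 = wait for the next vertical blank, 0 = present immediately.
        swapChain->Present(vsync ? 1 : 0, 0);

        auto now = clock::now();
        double frameMs = std::chrono::duration<double, std::milli>(now - lastFrame).count();
        lastFrame = now;

        // With vsync on, once a frame misses a vblank I expected frameMs to snap
        // to 33.3ms (30 FPS); instead it sits steadily around ~21ms (~48 FPS).
        (void)frameMs; // logged in the real code
    }
}
```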
More details:
I'm testing this in (I think exclusive) fullscreen, though I've tested in windowed mode as well; see the sketch after this list for how I toggle that. (I've fixed all the DXGI runtime warnings about fullscreen performance issues, so I'm pretty sure I have my swap chain configured correctly.)
If it matters, I'm using DXGI_SWAP_EFFECT_DISCARD (but have tested SEQUENTIAL, FLIP_SEQUENTIAL, and FLIP_DISCARD with no apparent effect).
I've tried calling Present with a SyncInterval of both 0 (no vsync) and 1 (vsync every vertical blank). Using 1 adds a small but noticeable amount of latency, as one would expect, but increasing BufferCount doesn't add to it.
I've tested on three computers: one with a GTX 970, one with a mobile Radeon R9 M370X, and one virtual machine running on VirtualBox. All exhibit this same behavior (or lack thereof).
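(Regarding the fullscreen/windowed point above: the toggle is essentially just a SetFullscreenState call, something like the sketch below; BufferCount and the swap effect are set at creation time as in the first sketch, and vsync is just the SyncInterval passed to Present.)

```cpp
#include <windows.h>
#include <dxgi.h>

// Roughly how I switch between exclusive fullscreen and windowed for these tests.
void SetTestDisplayMode(IDXGISwapChain* swapChain, bool exclusiveFullscreen)
{
    // Passing nullptr for the target lets DXGI pick the output containing the window.
    swapChain->SetFullscreenState(exclusiveFullscreen ? TRUE : FALSE, nullptr);
}
```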
So can anyone explain why I'm not seeing any change in latency or locking to 60/30/20/... FPS with vsync on? Am I doing something wrong? Am I not understanding how swap chains work? Is the graphics driver being too clever?
Thanks for your help!
(As an aside, does anyone know for sure what I *should* be setting BufferCount to for double- and triple-buffering? Some sources I've read say to set it to 1 and 2 respectively; others say 2 and 3.)