The only OpenGL vs D3D benchmarks were GL vs D3D9, aka not a reflection on the reality of games going forward
The few numbers the Cass Everitt/John McDonald presentations had, were with DX11 as reference. They had one or two DX11 paths, then a bunch of OpenGL paths.
I was specifically referring to the Valve OpenGL vs D3D9 presentation a while back, when a lot of people got excited about OpenGL being faster by a couple of fps. I probably should have called that out, however, so My Bad.
However, to circle around to your point, there was nothing surprising in the talk with regards to the D3D vs OpenGL numbers either; in fact, if any graphics programmer WAS surprised by these numbers (for example, that map-discard was slower than persistent mapping), they should probably reconsider their knowledge base significantly.
And that's the best you can do now (on NV hardware, see previous comments about AMD and Intel) but, more importantly, this is as good as it gets. That's it. There is no more 'magic' which can be done. What you can submit on one thread is basically fixed going forward.
Which brings me to my D3D12/Mantle point; they have or are going to gain (at least) persistent mapping, so suddenly that becomes just as fast. That leaves multi-draw indirect and its ilk, and I'd be surprised if that wasn't exposed too, but even if it isn't I doubt this is going to be a problem.
OpenGL is still late validating state, which is fine and great and if you remove interactions it gets you a long way, but it's still a cost every frame.
Mantle and D3D12's model allows for fixed state creation and validation cost ONCE for, if not all, then many many things, and also allows you to split the workload over threads.
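To make the "validate once, record on many threads" idea concrete, here is a rough sketch in plain C++. Everything in it is a stand-in I made up for illustration: `DrawCommand`, `CommandBuffer`, `recordRange` and `recordParallel` are hypothetical and are not D3D12 or Mantle API. The `pipelineId` plays the role of a state object whose creation and validation cost was paid up front, so each worker only appends draws to a buffer it alone owns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-ins for illustration only; not a real D3D12/Mantle API.
struct DrawCommand {
    std::uint32_t pipelineId; // pre-validated state object, created once
    std::uint32_t meshId;     // what to draw
};
using CommandBuffer = std::vector<DrawCommand>;

// Record draws for meshes[begin, end) into a buffer owned by one thread.
// No state validation happens here; the pipeline is assumed already built.
void recordRange(const std::vector<std::uint32_t>& meshes,
                 std::size_t begin, std::size_t end,
                 std::uint32_t pipelineId, CommandBuffer& out) {
    out.reserve(end - begin);
    for (std::size_t i = begin; i < end; ++i) {
        out.push_back({pipelineId, meshes[i]});
    }
}

// Split the scene into contiguous chunks and record each chunk on its own
// thread. Buffers come back in submission order (chunk 0 first).
std::vector<CommandBuffer> recordParallel(const std::vector<std::uint32_t>& meshes,
                                          std::uint32_t pipelineId,
                                          unsigned workerCount) {
    assert(workerCount > 0);
    std::vector<CommandBuffer> buffers(workerCount);
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    const std::size_t chunk = (meshes.size() + workerCount - 1) / workerCount;
    for (unsigned w = 0; w < workerCount; ++w) {
        const std::size_t begin = std::min(meshes.size(), w * chunk);
        const std::size_t end = std::min(meshes.size(), begin + chunk);
        // Each thread writes only to buffers[w], so no locking is needed.
        workers.emplace_back([&meshes, &buffers, begin, end, pipelineId, w] {
            recordRange(meshes, begin, end, pipelineId, buffers[w]);
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    return buffers;
}
```

Calling `recordParallel(meshes, pipelineId, 4)` gives back four buffers that a single thread could then hand to the queue in order. The point of the design is that the per-draw work is embarrassingly parallel once validation has been moved to creation time.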
So in a 4 core system, assuming reasonable scaling and a 'worst case' situation of D3D12/Mantle submission of scenes taking the same amount of time as OpenGL (which, given the presentation data, I doubt, but anyway...), a D3D12/Mantle app could submit draw calls somewhere in the region of 3.9x faster than an OpenGL app. Throw in more cores and that gap widens.
Now, maybe they don't do any MORE work, so the scene complexity and GPU load is largely the same, but it does mean that the game can get back to doing other processing that much faster.
Even if core counts aren't increasing D3D12/Mantle should allow for somewhere in the region of a 3.9 to 11.n increase in scene submission rate depending on "various factors".
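For anyone wondering where numbers like 3.9 and 11.n could come from, one possible reading is plain "threads times per-thread efficiency". This is only a back-of-the-envelope sketch: `submissionSpeedup` is a hypothetical helper, and the 0.975 efficiency and the 12 hardware threads are my assumptions, picked so the results land in the quoted range. It is not a measurement of anything.

```cpp
#include <cassert>

// Rough estimate of submission speedup over a single-threaded submitter.
// Assumes per-thread submission cost is the same as the single-threaded
// path and that every extra thread runs at the given efficiency (0..1].
// Both are assumptions for illustration, not measured values.
double submissionSpeedup(unsigned threads, double efficiency) {
    assert(threads > 0);
    assert(efficiency > 0.0 && efficiency <= 1.0);
    return static_cast<double>(threads) * efficiency;
}
```

With those assumptions, `submissionSpeedup(4, 0.975)` comes out at about 3.9 and `submissionSpeedup(12, 0.975)` at about 11.7. Real scaling will depend on the "various factors" above (driver overhead, memory bandwidth, how evenly the scene splits), so treat this as an upper-bound style estimate.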
So while those AZDO numbers might look nice, that's it.
Enjoy 'em.
You'll not get any more.
(And even then NV only, and outside of a "certain hardware set" you'll need fallbacks, as things like DrawID don't work and maybe never will for certain hardware - which is why I'd like MultiDrawIndirect but, if the hardware can't do useful things with it, I won't be too upset with multi-threaded command buffer building and just normal DrawIndirect calls.)