Mostly for what it's worth, I don't think there are a lot of good reasons left these days for ddraw accessing the wined3d front buffer. In particular, in ddraw_surface_update_frontbuffer() we should be able to just blit to the back buffer and then call wined3d_swapchain_present() with a 0 swap interval.
For my own edification, why have we historically rendered directly to the front buffer?
Originally, because it was the straightforward thing to do; we'd translate ddraw front buffer blits to X11/GL front buffer blits. There was no wined3d at that point, nor necessarily OpenGL. (XFree86-DGA was a thing, once upon a time.)
That model was largely translated as-is when ddraw started using wined3d. There's still a fast-path to implement blits from the back buffer to the front buffer as a call to wined3d_swapchain_present() in texture2d_blt(). Then, at some point ddraw_surface_update_frontbuffer() was introduced. The reasons included fixing issues with drawing outside the swapchain window (if any) when using GL, retaining the contents of swapchain surfaces after Flip(), and fixing some performance issues with applications locking the front buffer by keeping track of the affected rectangle. This effectively virtualised ddraw access to the front buffer, much as modern windowing systems tend to do; the performance advantages of directly blitting to the front buffer largely no longer exist today. The introduction of "AlwaysOffscreen" effectively removed direct access to the back buffer.
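To illustrate the "keeping track of the affected rectangle" part, here's a minimal sketch of accumulating a dirty region across front buffer locks, so that only the touched area needs to be pushed to the screen. The struct and function names are made up for the example; this is not the actual ddraw code.

```c
#include <stdbool.h>

/* Hypothetical rectangle type for this sketch; not the ddraw or wined3d one.
 * "right" and "bottom" are exclusive. */
struct rect
{
    int left, top, right, bottom;
};

static bool rect_is_empty(const struct rect *r)
{
    return r->left >= r->right || r->top >= r->bottom;
}

/* Grow "dirty" to the bounding box of itself and "r". */
static void dirty_rect_add(struct rect *dirty, const struct rect *r)
{
    if (rect_is_empty(r))
        return;
    if (rect_is_empty(dirty))
    {
        *dirty = *r;
        return;
    }
    if (r->left < dirty->left) dirty->left = r->left;
    if (r->top < dirty->top) dirty->top = r->top;
    if (r->right > dirty->right) dirty->right = r->right;
    if (r->bottom > dirty->bottom) dirty->bottom = r->bottom;
}

/* Hand the accumulated region to the caller and reset it.
 * Returns false if there is nothing to update. */
static bool dirty_rect_take(struct rect *dirty, struct rect *out)
{
    if (rect_is_empty(dirty))
        return false;
    *out = *dirty;
    dirty->left = dirty->top = dirty->right = dirty->bottom = 0;
    return true;
}
```

An unlock of the front buffer would call something like dirty_rect_add() with the locked rectangle, and the update path would call dirty_rect_take() to find out what to blit, if anything.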
What remained then was that ddraw_surface_update_frontbuffer() never used wined3d_swapchain_present() until commit 034e88e038e8114ec31261d88dece1e2691185fb. These days it does though, and that pretty much leaves the potential overhead of wined3d_swapchain_present() as the only reason for not calling it, perhaps most significantly from swapchain_blit(). I think we should try to reduce that overhead in any case.
Thanks for the explanation, that mostly makes sense.
There are of course some optimisations to wined3d's present path we'd like to make regardless.
What do you have in mind?
Broadly, I'd like to reduce the number of blits/copies involved in getting surfaces to the screen, and I think we should be able to make some progress on that using an approach similar to wined3d_buffer_set_bo()/UPLOAD_BO_RENAME_ON_UNMAP. I.e., if we're completely replacing the contents of a texture without scaling or format conversion, we could just propagate the underlying texture/VkImage. Ideally we'd be able to do that all the way to the display driver in the kernel.
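As a rough sketch of the idea, with entirely hypothetical types (this is not how wined3d_buffer_set_bo() is implemented, and "storage" just stands in for the underlying GL texture or VkImage): when a blit replaces the whole destination with no scaling or format conversion, the two textures can exchange their storage instead of copying pixels.

```c
#include <stdbool.h>

/* Hypothetical formats for this sketch. */
enum format
{
    FORMAT_B8G8R8X8,
    FORMAT_B5G6R5,
};

/* Hypothetical stand-in for a texture; "storage" plays the role of the
 * underlying GL texture / VkImage. */
struct texture
{
    unsigned int width, height;
    enum format format;
    void *storage;
};

/* Try to satisfy a full-surface copy from "src" to "dst" by exchanging the
 * underlying storage. Returns false if the caller needs to do a real blit
 * (scaling or format conversion would be required). On success "src" holds
 * the old storage of "dst", so its contents are undefined afterwards. */
static bool texture_try_rename(struct texture *dst, struct texture *src)
{
    void *old;

    if (dst->width != src->width || dst->height != src->height)
        return false;
    if (dst->format != src->format)
        return false;

    old = dst->storage;
    dst->storage = src->storage;
    src->storage = old;
    return true;
}
```

The exchange only works if nothing still depends on the source contents afterwards; a real implementation would also have to deal with outstanding references to the old storage, which the sketch ignores.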
So to make sure I understand... the point here is we can't "just" do that with GL, which is why we want to instead back wined3d swapchains with X11 pixmaps and use XPresent to get them to the screen. Or, in the case of Vulkan, actually back wined3d swapchains with Vulkan swapchains (which we don't do yet... why?)
I suppose this will be another case of "that's nice to have, but I don't think it should block this patch". If only because there's too much work I have queued at the moment depending on this...