I don't think I can review 4/5 without understanding why that code is there in the first place. It was introduced by a7c320d184fb, which didn't provide a reason. I'd assume that there's a Vulkan-like "all GPU commands have to be done before destroying an object" requirement, but a quick (too quick?) reading of ARB_sync (specifically § D.2.1) implies that's not the case here. And that doesn't explain the glFinish() added in 3cc8147594de either. Why do we need that?
It's not about destroying the fences themselves, but about destroying the resources protected by those fences. In particular, wined3d_context_gl_cleanup_resources() is run from wined3d_context_gl_wait_command_fence(). There may also be a requirement to flush any pending (draw) commands on the context; I don't quite remember whether the wglDeleteContext()/wglMakeCurrent() in wined3d_context_gl_cleanup() may do that implicitly, but they might.
As for the glFinish() from commit 3cc8147594de868884d3e57babff8eef058c63a3, the idea was to make sure all pending GL commands have finished like wined3d_context_gl_wait_command_fence() would. Unfortunately we can't just run wined3d_context_gl_cleanup_resources() on a destroyed context, but it should be fine; we'll either get a chance to clean things up when the new context we switch to submits a fence or waits for one, or we're destroying the device and device destruction will take care of it.
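To make the "resources protected by fences" point a bit more concrete, here is a rough toy model in C. It is not wined3d code: every name in it (toy_context, toy_retire_bo(), toy_cleanup_resources(), toy_wait_command_fence()) is made up for illustration, and the "wait" simply marks a fence as completed instead of talking to GL. It only shows the shape of the idea: a bo is queued with the fence id that may still reference it, and is released once that fence is known to have completed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model, not wined3d code: all names are invented. */
#define MAX_RETIRED 8

struct retired_bo
{
    int id;             /* illustrative buffer object identifier */
    uint64_t fence_id;  /* last command fence that may still use the bo */
};

struct toy_context
{
    uint64_t submitted_fence_id;  /* last fence submitted to the "GPU" */
    uint64_t completed_fence_id;  /* last fence known to have completed */
    struct retired_bo retired[MAX_RETIRED];
    size_t retired_count;
};

/* Queue a bo for release once the most recently submitted fence completes. */
static void toy_retire_bo(struct toy_context *ctx, int id)
{
    assert(ctx->retired_count < MAX_RETIRED);
    ctx->retired[ctx->retired_count].id = id;
    ctx->retired[ctx->retired_count].fence_id = ctx->submitted_fence_id;
    ++ctx->retired_count;
}

/* Release every retired bo whose fence has completed; returns the count. */
static size_t toy_cleanup_resources(struct toy_context *ctx)
{
    size_t i = 0, freed = 0;

    while (i < ctx->retired_count)
    {
        if (ctx->retired[i].fence_id <= ctx->completed_fence_id)
        {
            /* Swap-remove: move the last entry into the freed slot. */
            ctx->retired[i] = ctx->retired[--ctx->retired_count];
            ++freed;
        }
        else
        {
            ++i;
        }
    }
    return freed;
}

/* Stand-in for a fence wait: mark the fence completed, then clean up. */
static size_t toy_wait_command_fence(struct toy_context *ctx, uint64_t fence_id)
{
    if (fence_id > ctx->completed_fence_id)
        ctx->completed_fence_id = fence_id;
    return toy_cleanup_resources(ctx);
}
```

In this model, a context that goes away without ever waiting on its last fence just leaves entries in the retired list; they get released the next time something waits on a later fence, which is roughly the "we'll get a chance to clean things up later" case described above.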
I think I'm still missing something. I assume that the rub is "GL sync is implicit *until* we start using ARB_sync and persistent maps, or similarly APPLE_flush_buffer_range", but that doesn't explain the use of glFinish(), which presumably wouldn't be required in that case. [I also don't understand why a7c320d184fb specifically adds that wait to wined3d_context_gl_cleanup()? Unless it's incidental to that change.]
Could we get rid of this code entirely? Possibly. We'd end up with slightly different semantics for GL context cleanup and Vulkan context cleanup in terms of what commands are guaranteed to have been completed. We may end up holding on to the bo allocations a bit longer than we otherwise would. Perhaps that's fine.
I think it makes sense to use similar code, I'm just struggling a little to understand why it's necessary to explicitly synchronize here.