After the discussion on IRC I wasn't quite sure what you intended with wined3d_event_query_wait(), but maybe it was something like this:
I've based the sync code on IWineD3DQuery event queries now rather than using the GL APIs. This way the buffer code is insulated from the thread switch issues, the different fence extensions etc.
Issues:
* D3D doesn't have a glFinishFence equivalent, just something like glTestFence, which needs to be polled. I guess that's where wined3d_event_query_wait() comes into play.
* Patch 4 changes the query behavior for apps. If the app just wonders if it can do some more work before sending more D3D commands then this is going to hurt. If the app implements some synchronization around D3DLOCK_NOOVERWRITE then we need this anyway. A separate function would help there as well, although the two functions would then have inconsistent behavior.
* Refcounting. The query holds a ref to the device, the buffer holds a ref to the query, and the device holds a ref to the buffer (if the buffer is set as stream source), so the references form a cycle. However, it should still work because the d3d9 counts are unaffected, and when the app destroys the d3d9 device the stateblock refs are freed, which breaks the cycle. I could of course add a non-COM structure for event queries and use that, but that shouldn't be needed.
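To illustrate the first issue, here is a rough sketch of how a blocking wait could be built on top of a test-only primitive by polling. It is only a guess at the idea, not the code from the patches: struct fake_event_query, fake_event_query_test() and fake_event_query_wait() are made-up names, and the countdown field just simulates pending GPU work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an event query; NOT the real wined3d structure. */
struct fake_event_query
{
    unsigned int polls_until_done; /* simulates GPU work that is still pending */
};

/* Non-blocking test, in the spirit of glTestFence:
 * returns true once the simulated event has passed. */
static bool fake_event_query_test(struct fake_event_query *query)
{
    if (query->polls_until_done)
    {
        --query->polls_until_done;
        return false;
    }
    return true;
}

/* Blocking wait built purely from the test function by busy polling.
 * Returns the number of unsuccessful polls before the event was signalled. */
static unsigned int fake_event_query_wait(struct fake_event_query *query)
{
    unsigned int spins = 0;

    assert(query);
    while (!fake_event_query_test(query))
        ++spins;

    return spins;
}
```

A caller would issue the event after the last draw that reads the buffer, then call the wait function before touching the buffer memory. A real implementation would probably want to yield or bound the loop rather than spin indefinitely.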