> > I think it's an improvement too, so I'll approve this. I do think there's further room for improvement though, both in terms of performance and in terms of code quality, and I'd prefer seeing those sooner rather than later. (E.g., I don't like the magic "16"; I don't like that we're rolling our own spinlocks here; I don't like the number of atomic operations in what's supposed to be a hot path.)
>
> I gave this some more thought, and I'm not sure I like the idea of using spinlocks (either implemented by us or by others) any more. They're not wait-free and they don't even coordinate with the operating system, meaning that if a thread is suspended while a spinlock is held, any other thread trying to acquire the same spinlock will spin busily for an entire scheduling quantum (or more). In our case that's slightly different because there is striping, but there are still scenarios in which that can fail (depending on the number of active threads, CPUs and stripe buckets), so I don't like it. While I understand the engineering problems of the wait-free option with the CPU-specific code, I can't help but think that after some initial investment that's going to remove some opportunities for stuttering that may be even harder to reproduce and debug later.
I don't particularly like the spinlocks either, but the striping at least somewhat mitigates the issues here. As mentioned before, I think the more problematic parts here are "next_index" and "free_count", which I'd expect to bounce between CPU caches in the cases we care about. The lock-free list wouldn't really avoid that issue either; it would have similar issues with the list head. The main benefit of thread-local schemes would be that they keep data local to the thread as much as possible. Or that's the theory anyway; we'd also want some careful benchmarking to be done...
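
For reference, a minimal sketch of what a striped spinlock looks like with C11 atomics. This is not the code from the merge request: STRIPE_COUNT, struct stripe, the stripe_*() helpers, the modulo stripe selection and the 64-byte cache line size are all assumptions made for illustration. Each stripe gets its own lock and its own counter, so threads that hash to different stripes don't contend, while two threads on the same stripe still busy-wait on each other.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stripe count, standing in for the magic "16". */
#define STRIPE_COUNT 16

struct stripe
{
    /* Align each stripe to an assumed 64-byte cache line, so that
     * neighbouring stripes do not share a line. */
    _Alignas(64) atomic_flag lock;
    /* Only accessed while the lock is held. */
    unsigned int free_count;
};

static struct stripe stripes[STRIPE_COUNT];

static void stripes_init(void)
{
    size_t i;

    for (i = 0; i < STRIPE_COUNT; ++i)
    {
        atomic_flag_clear(&stripes[i].lock);
        stripes[i].free_count = 0;
    }
}

static struct stripe *stripe_acquire(uintptr_t key)
{
    struct stripe *s = &stripes[key % STRIPE_COUNT];

    /* Busy-wait. Nothing here tells the scheduler that we are waiting,
     * which is the concern raised in the quoted message. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;

    return s;
}

static void stripe_release(struct stripe *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Add "count" free entries to the stripe selected by "key" and return
 * the new per-stripe total. */
static unsigned int stripe_add_free(uintptr_t key, unsigned int count)
{
    struct stripe *s = stripe_acquire(key);
    unsigned int total = (s->free_count += count);

    stripe_release(s);
    return total;
}
```

A single shared counter updated by every thread would still bounce between caches regardless of how the locks are striped; keeping the counter inside the stripe, as above, is one way to avoid that, at the cost of having to sum over all stripes to get a global total.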
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/297#note_43543
> I think it's an improvement too, so I'll approve this. I do think there's further room for improvement though, both in terms of performance and in terms of code quality, and I'd prefer seeing those sooner rather than later. (E.g., I don't like the magic "16"; I don't like that we're rolling our own spinlocks here; I don't like the number of atomic operations in what's supposed to be a hot path.)
I gave this some more thought, and I'm not sure I like the idea of using spinlocks (either implemented by us or by others) any more. They're not wait-free and they don't even coordinate with the operating system, meaning that if a thread is suspended while a spinlock is held, any other thread trying to acquire the same spinlock will spin busily for an entire scheduling quantum (or more). In our case that's slightly different because there is striping, but there are still scenarios in which that can fail (depending on the number of active threads, CPUs and stripe buckets), so I don't like it. While I understand the engineering problems of the wait-free option with the CPU-specific code, I can't help but think that after some initial investment that's going to remove some opportunities for stuttering that may be even harder to reproduce and debug later.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/297#note_43540
---
See the added comment for details on what is going on.
--
v3: d3d9/tests: Wait longer in test_occlusion_query for software renderers.
d3d9/tests: The device window may restore behind our back in test_wndproc.
d3d9/tests: Work around test_reset_fullscreen failing on gitlab CI.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3565
--
v2: vkd3d-shader/spirv: Handle thread group UAV barriers.
vkd3d-shader/spirv: Include Uniform in the memory semantics for UAV barriers.
vkd3d-shader/spirv: Handle globally coherent UAVs.
vkd3d-shader: Introduce a UAV_GLOBALLY_COHERENT descriptor info flag.
vkd3d-shader: Introduce the RASTERISER_ORDERED_VIEW UAV flag.
vkd3d-shader/tpf: Fix extraction of the UAV declaration flags.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/306
--
v2: gdiplus: Fix GdipCreateBitmapFromHICON bitmap data.
gdiplus/tests: Add test for bitmap locked data from GdipCreateBitmapFromHICON.
gdiplus/tests: Add test for non-square icon with GdipCreateBitmapFromHICON.
gdiplus/tests: Add test for 32 bpp icon with GdipCreateBitmapFromHICON.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3657
Since GStreamer 1.20, gst_element_request_pad_simple is available and
gst_element_get_request_pad is marked as deprecated.
--
v6: winegstreamer: Add MFMPEG4SinkClassFactory.
mf/tests: Use h264 and aac in mp4 media sink tests.
mf/tests: Add tests for h264 encoder.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3636
It's no problem to send fewer of these per MR. I have included the complete set because all but the last introduce no functional changes, and upstreaming a smaller set would leave the changes in a half-done state with unnecessary buffering.
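
To illustrate the general idea of storing commands in a buffer until they are executed, here is a minimal sketch in C. It is not taken from the patches: struct cmd, struct cmd_buffer, the record_*() helpers and the sum_units() callback are hypothetical names, and only two command types are shown. Recording appends an opcode and its arguments to a growable array; execution replays the array in order and then resets it.

```c
#include <stdlib.h>

enum cmd_op
{
    CMD_DRAW_INSTANCED,
    CMD_DISPATCH,
};

struct cmd
{
    enum cmd_op op;
    union
    {
        struct { unsigned int vertex_count, instance_count; } draw;
        struct { unsigned int x, y, z; } dispatch;
    } u;
};

struct cmd_buffer
{
    struct cmd *cmds;
    size_t count, capacity;
};

/* Reserve one slot, growing the array if needed. Returns NULL on
 * allocation failure. */
static struct cmd *cmd_buffer_push(struct cmd_buffer *b, enum cmd_op op)
{
    struct cmd *c;

    if (b->count == b->capacity)
    {
        size_t new_capacity = b->capacity ? b->capacity * 2 : 16;
        struct cmd *new_cmds = realloc(b->cmds, new_capacity * sizeof(*new_cmds));

        if (!new_cmds)
            return NULL;
        b->cmds = new_cmds;
        b->capacity = new_capacity;
    }

    c = &b->cmds[b->count++];
    c->op = op;
    return c;
}

static int record_draw_instanced(struct cmd_buffer *b,
        unsigned int vertex_count, unsigned int instance_count)
{
    struct cmd *c;

    if (!(c = cmd_buffer_push(b, CMD_DRAW_INSTANCED)))
        return 0;
    c->u.draw.vertex_count = vertex_count;
    c->u.draw.instance_count = instance_count;
    return 1;
}

static int record_dispatch(struct cmd_buffer *b,
        unsigned int x, unsigned int y, unsigned int z)
{
    struct cmd *c;

    if (!(c = cmd_buffer_push(b, CMD_DISPATCH)))
        return 0;
    c->u.dispatch.x = x;
    c->u.dispatch.y = y;
    c->u.dispatch.z = z;
    return 1;
}

/* Replay the recorded commands in order through "execute", then reset
 * the buffer. Returns the number of commands replayed. */
static size_t cmd_buffer_execute(struct cmd_buffer *b,
        void (*execute)(const struct cmd *cmd, void *ctx), void *ctx)
{
    size_t i, count = b->count;

    for (i = 0; i < count; ++i)
        execute(&b->cmds[i], ctx);
    b->count = 0;
    return count;
}

/* Example callback: sums vertices drawn and thread groups dispatched.
 * A real backend would issue the corresponding API calls here. */
static void sum_units(const struct cmd *cmd, void *ctx)
{
    unsigned long *total = ctx;

    if (cmd->op == CMD_DRAW_INSTANCED)
        *total += (unsigned long)cmd->u.draw.vertex_count * cmd->u.draw.instance_count;
    else
        *total += (unsigned long)cmd->u.dispatch.x * cmd->u.dispatch.y * cmd->u.dispatch.z;
}
```

In a scheme like this, each recording function only copies its arguments, which is consistent with the earlier patches in such a series being non-functional until the final patch switches execution over to the buffered path.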
--
v5: vkd3d: Store command list commands in a buffer until executed.
vkd3d: Store WriteBufferImmediate() arguments in a buffer.
vkd3d: Store ExecuteIndirect() arguments in a buffer.
vkd3d: Store SetPredication() arguments in a buffer.
vkd3d: Store ResolveQueryData() arguments in a buffer.
vkd3d: Store EndQuery() arguments in a buffer.
vkd3d: Store BeginQuery() arguments in a buffer.
vkd3d: Store d3d12_command_list_clear_uav() arguments in a buffer.
vkd3d: Store ClearRenderTargetView() arguments in a buffer.
vkd3d: Store ClearDepthStencilView() arguments in a buffer.
vkd3d: Store OMSetRenderTargets() arguments in a buffer.
vkd3d: Store SOSetTargets() arguments in a buffer.
vkd3d: Store IASetVertexBuffers() arguments in a buffer.
vkd3d: Store IASetIndexBuffer() arguments in a buffer.
vkd3d: Store d3d12_command_list_set_root_descriptor() arguments in a buffer.
vkd3d: Store d3d12_command_list_set_root_cbv() arguments in a buffer.
vkd3d: Store d3d12_command_list_set_root_constants() arguments in a buffer.
vkd3d: Store d3d12_command_list_set_descriptor_table() arguments in a buffer.
vkd3d: Store d3d12_command_list_set_root_signature() arguments in a buffer.
vkd3d: Add an internal refcount to struct d3d12_root_signature.
vkd3d: Store ResourceBarrier() arguments in a buffer.
vkd3d: Store SetPipelineState() arguments in a buffer.
vkd3d: Store OMSetStencilRef() arguments in a buffer.
vkd3d: Store OMSetBlendFactor() arguments in a buffer.
vkd3d: Store RSSetScissorRects() arguments in a buffer.
vkd3d: Store RSSetViewports() arguments in a buffer.
vkd3d: Store IASetPrimitiveTopology() arguments in a buffer.
vkd3d: Store ResolveSubresource() arguments in a buffer.
vkd3d: Store CopyResource() arguments in a buffer.
vkd3d: Store CopyTextureRegion() arguments in a buffer.
vkd3d: Store CopyBufferRegion() arguments in a buffer.
vkd3d: Store Dispatch() arguments in a buffer.
vkd3d: Store DrawIndexedInstanced() arguments in a buffer.
vkd3d: Store DrawInstanced() arguments in a buffer.
This merge request has too many patches to be relayed via email.
Please visit the URL below to see the contents of the merge request.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/294