This MR completes https://gitlab.winehq.org/wine/wine/-/merge_requests/3165. The architecture was changed slightly from what is explained there after some discussion: instead of adding special support for `vkQueueSubmit()` and `vkQueuePresentKHR()`, vkd3d is not changed. Instead, the behavior of `vkd3d_acquire_vk_queue()` is specified, with a recommendation on how to ensure that frames are presented in order (see https://gitlab.winehq.org/wine/vkd3d/-/commit/b2a1f6b5e4f59fbc7f91ada7e56563...). The recommended behavior is implemented in this MR, and it is known to fix presentation bugs on The Medium.
From: Giovanni Mascellani gmascellani@codeweavers.com
In the following commits the worker thread will be used for actual presenting and buffer resizing. In this way the worker thread can wait on the vkd3d queue without blocking or even deadlocking the application. --- dlls/dxgi/swapchain.c | 54 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index 34678a3a3a7..8a84ea30065 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1024,6 +1024,11 @@ struct d3d12_swapchain VkDevice vk_device; VkPhysicalDevice vk_physical_device;
+ HANDLE worker_thread; + CRITICAL_SECTION worker_cs; + CONDITION_VARIABLE worker_cv; + bool worker_running; + /* D3D12 side of the swapchain (frontend): these objects are * visible to the IDXGISwapChain client, so they must never be * recreated, except when ResizeBuffers*() is called. */ @@ -1070,6 +1075,26 @@ struct d3d12_swapchain uint32_t frame_latency; };
+static DWORD WINAPI d3d12_swapchain_worker_proc(void *data) +{ + struct d3d12_swapchain *swapchain = data; + + EnterCriticalSection(&swapchain->worker_cs); + + while (swapchain->worker_running) + { + if (!SleepConditionVariableCS(&swapchain->worker_cv, &swapchain->worker_cs, INFINITE)) + { + ERR("Cannot sleep on condition variable, last error %ld.\n", GetLastError()); + break; + } + } + + LeaveCriticalSection(&swapchain->worker_cs); + + return 0; +} + static DXGI_FORMAT dxgi_format_from_vk_format(VkFormat vk_format) { switch (vk_format) @@ -1808,6 +1833,23 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain) { const struct dxgi_vk_funcs *vk_funcs = &swapchain->vk_funcs; void *vulkan_module = vk_funcs->vulkan_module; + DWORD ret; + + EnterCriticalSection(&swapchain->worker_cs); + swapchain->worker_running = false; + WakeAllConditionVariable(&swapchain->worker_cv); + LeaveCriticalSection(&swapchain->worker_cs); + + if (swapchain->worker_thread) + { + if ((ret = WaitForSingleObject(swapchain->worker_thread, INFINITE)) != WAIT_OBJECT_0) + ERR("Failed to wait for worker thread, return value %ld.\n", ret); + + if (!CloseHandle(swapchain->worker_thread)) + ERR("Failed to close worker thread, last error %ld.\n", GetLastError()); + } + + DeleteCriticalSection(&swapchain->worker_cs);
d3d12_swapchain_destroy_vulkan_resources(swapchain); d3d12_swapchain_destroy_d3d12_resources(swapchain); @@ -2996,6 +3038,10 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI
wined3d_private_store_init(&swapchain->private_store);
+ InitializeCriticalSection(&swapchain->worker_cs); + InitializeConditionVariable(&swapchain->worker_cv); + swapchain->worker_running = true; + surface_desc.sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR; surface_desc.pNext = NULL; surface_desc.flags = 0; @@ -3069,6 +3115,14 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI
IWineDXGIFactory_AddRef(swapchain->factory = factory);
+ if (!(swapchain->worker_thread = CreateThread(NULL, 0, d3d12_swapchain_worker_proc, swapchain, 0, NULL))) + { + hr = HRESULT_FROM_WIN32(GetLastError()); + WARN("Failed to create worker thread, hr %#lx.\n", hr); + d3d12_swapchain_destroy(swapchain); + return hr; + } + return S_OK; }
From: Giovanni Mascellani gmascellani@codeweavers.com
The main thread still waits for the worker thread, in order to keep buffer resizing synchronized with presentation. As soon as buffer presentation is offloaded to the worker thread too the wait can be dropped. --- dlls/dxgi/swapchain.c | 156 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 128 insertions(+), 28 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index 8a84ea30065..f140317781e 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1028,6 +1028,8 @@ struct d3d12_swapchain CRITICAL_SECTION worker_cs; CONDITION_VARIABLE worker_cv; bool worker_running; + struct list worker_ops; + HANDLE worker_event;
/* D3D12 side of the swapchain (frontend): these objects are * visible to the IDXGISwapChain client, so they must never be @@ -1075,6 +1077,33 @@ struct d3d12_swapchain uint32_t frame_latency; };
+enum d3d12_swapchain_op_type +{ + D3D12_SWAPCHAIN_OP_PRESENT, +}; + +struct d3d12_swapchain_op +{ + struct list entry; + enum d3d12_swapchain_op_type type; + union + { + struct + { + unsigned int sync_interval; + VkImage vk_image; + unsigned int frame_number; + } present; + }; +}; + +static void d3d12_swapchain_op_destroy(struct d3d12_swapchain_op *op) +{ + heap_free(op); +} + +static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op); + static DWORD WINAPI d3d12_swapchain_worker_proc(void *data) { struct d3d12_swapchain *swapchain = data; @@ -1083,7 +1112,29 @@ static DWORD WINAPI d3d12_swapchain_worker_proc(void *data)
while (swapchain->worker_running) { - if (!SleepConditionVariableCS(&swapchain->worker_cv, &swapchain->worker_cs, INFINITE)) + if (!list_empty(&swapchain->worker_ops)) + { + struct d3d12_swapchain_op *op = LIST_ENTRY(list_head(&swapchain->worker_ops), struct d3d12_swapchain_op, entry); + + list_remove(&op->entry); + + LeaveCriticalSection(&swapchain->worker_cs); + + switch (op->type) + { + case D3D12_SWAPCHAIN_OP_PRESENT: + d3d12_swapchain_op_present_execute(swapchain, op); + break; + } + + d3d12_swapchain_op_destroy(op); + + if (!SetEvent(swapchain->worker_event)) + ERR("Cannot set event, last error %ld.\n", GetLastError()); + + EnterCriticalSection(&swapchain->worker_cs); + } + else if (!SleepConditionVariableCS(&swapchain->worker_cv, &swapchain->worker_cs, INFINITE)) { ERR("Cannot sleep on condition variable, last error %ld.\n", GetLastError()); break; @@ -1833,6 +1884,7 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain) { const struct dxgi_vk_funcs *vk_funcs = &swapchain->vk_funcs; void *vulkan_module = vk_funcs->vulkan_module; + struct d3d12_swapchain_op *op, *op2; DWORD ret;
EnterCriticalSection(&swapchain->worker_cs); @@ -1849,8 +1901,16 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain) ERR("Failed to close worker thread, last error %ld.\n", GetLastError()); }
+ if (!CloseHandle(swapchain->worker_event)) + ERR("Failed to close worker event, last error %ld.\n", GetLastError()); + DeleteCriticalSection(&swapchain->worker_cs);
+ LIST_FOR_EACH_ENTRY_SAFE(op, op2, &swapchain->worker_ops, struct d3d12_swapchain_op, entry) + { + d3d12_swapchain_op_destroy(op); + } + d3d12_swapchain_destroy_vulkan_resources(swapchain); d3d12_swapchain_destroy_d3d12_resources(swapchain);
@@ -2055,29 +2115,13 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, return vr; }
-static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, - unsigned int sync_interval, unsigned int flags) +static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op) { - HANDLE frame_latency_event; VkQueue vk_queue; VkResult vr; HRESULT hr;
- if (sync_interval > 4) - { - WARN("Invalid sync interval %u.\n", sync_interval); - return DXGI_ERROR_INVALID_CALL; - } - - if (flags & ~DXGI_PRESENT_TEST) - FIXME("Unimplemented flags %#x.\n", flags); - if (flags & DXGI_PRESENT_TEST) - { - WARN("Returning S_OK for DXGI_PRESENT_TEST.\n"); - return S_OK; - } - - if (FAILED(hr = d3d12_swapchain_set_sync_interval(swapchain, sync_interval))) + if (FAILED(hr = d3d12_swapchain_set_sync_interval(swapchain, op->present.sync_interval))) return hr;
if (!(vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue))) @@ -2086,8 +2130,7 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, return E_FAIL; }
- vr = d3d12_swapchain_queue_present(swapchain, vk_queue, - swapchain->vk_images[swapchain->current_buffer_index]); + vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image); if (vr == VK_ERROR_OUT_OF_DATE_KHR) { vkd3d_release_vk_queue(swapchain->command_queue); @@ -2104,8 +2147,7 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, return E_FAIL; }
- if ((vr = d3d12_swapchain_queue_present(swapchain, vk_queue, - swapchain->vk_images[swapchain->current_buffer_index])) < 0) + if ((vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image)) < 0) ERR("Failed to present after recreating swapchain, vr %d.\n", vr); }
@@ -2117,12 +2159,11 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, return hresult_from_vk_result(vr); }
- ++swapchain->frame_number; - if ((frame_latency_event = swapchain->frame_latency_event)) + if (swapchain->frame_latency_fence) { - /* Bias the frame number to avoid underflowing in - * SetEventOnCompletion(). */ - uint64_t number = swapchain->frame_number + DXGI_MAX_SWAP_CHAIN_BUFFERS; + /* Use the same bias as d3d12_swapchain_present(). Add one to + * account for the "++swapchain->frame_numer" there. */ + uint64_t number = op->present.frame_number + DXGI_MAX_SWAP_CHAIN_BUFFERS + 1;
if (FAILED(hr = ID3D12CommandQueue_Signal(swapchain->command_queue, swapchain->frame_latency_fence, number))) @@ -2130,6 +2171,56 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, ERR("Failed to signal frame latency fence, hr %#lx.\n", hr); return hr; } + } + + return S_OK; +} + +static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, + unsigned int sync_interval, unsigned int flags) +{ + struct d3d12_swapchain_op *op; + HANDLE frame_latency_event; + HRESULT hr; + + if (sync_interval > 4) + { + WARN("Invalid sync interval %u.\n", sync_interval); + return DXGI_ERROR_INVALID_CALL; + } + + if (flags & ~DXGI_PRESENT_TEST) + FIXME("Unimplemented flags %#x.\n", flags); + if (flags & DXGI_PRESENT_TEST) + { + WARN("Returning S_OK for DXGI_PRESENT_TEST.\n"); + return S_OK; + } + + if (!(op = heap_alloc_zero(sizeof(*op)))) + { + WARN("Cannot allocate memory.\n"); + return E_OUTOFMEMORY; + } + + op->type = D3D12_SWAPCHAIN_OP_PRESENT; + op->present.sync_interval = sync_interval; + op->present.vk_image = swapchain->vk_images[swapchain->current_buffer_index]; + op->present.frame_number = swapchain->frame_number; + + EnterCriticalSection(&swapchain->worker_cs); + list_add_tail(&swapchain->worker_ops, &op->entry); + WakeAllConditionVariable(&swapchain->worker_cv); + LeaveCriticalSection(&swapchain->worker_cs); + + WaitForSingleObject(swapchain->worker_event, INFINITE); + + ++swapchain->frame_number; + if ((frame_latency_event = swapchain->frame_latency_event)) + { + /* Bias the frame number to avoid underflowing in + * SetEventOnCompletion(). */ + uint64_t number = swapchain->frame_number + DXGI_MAX_SWAP_CHAIN_BUFFERS;
if (FAILED(hr = ID3D12Fence_SetEventOnCompletion(swapchain->frame_latency_fence, number - swapchain->frame_latency, frame_latency_event))) @@ -3041,6 +3132,15 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI InitializeCriticalSection(&swapchain->worker_cs); InitializeConditionVariable(&swapchain->worker_cv); swapchain->worker_running = true; + list_init(&swapchain->worker_ops); + + if (!(swapchain->worker_event = CreateEventA(NULL, FALSE, FALSE, NULL))) + { + hr = HRESULT_FROM_WIN32(GetLastError()); + WARN("Failed to create worker event, hr %#lx.\n", hr); + d3d12_swapchain_destroy(swapchain); + return hr; + }
surface_desc.sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR; surface_desc.pNext = NULL;
From: Giovanni Mascellani gmascellani@codeweavers.com
--- dlls/dxgi/swapchain.c | 100 +++++++++++++++++++++++++++++++++--------- 1 file changed, 80 insertions(+), 20 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index f140317781e..ea93efb5643 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1055,6 +1055,7 @@ struct d3d12_swapchain unsigned int vk_swapchain_width; unsigned int vk_swapchain_height; VkPresentModeKHR present_mode; + DXGI_SWAP_CHAIN_DESC1 backend_desc;
uint32_t vk_image_index;
@@ -1080,6 +1081,7 @@ struct d3d12_swapchain enum d3d12_swapchain_op_type { D3D12_SWAPCHAIN_OP_PRESENT, + D3D12_SWAPCHAIN_OP_RESIZE_BUFFERS, };
struct d3d12_swapchain_op @@ -1094,15 +1096,39 @@ struct d3d12_swapchain_op VkImage vk_image; unsigned int frame_number; } present; + struct + { + /* The resize buffers op takes ownership of the memory and + * images used to back the previous ID3D12Resource + * objects, so that they're only released once the worker + * thread is done with them. */ + VkDeviceMemory vk_memory; + VkImage vk_images[DXGI_MAX_SWAP_CHAIN_BUFFERS]; + DXGI_SWAP_CHAIN_DESC1 desc; + } resize_buffers; }; };
-static void d3d12_swapchain_op_destroy(struct d3d12_swapchain_op *op) +static void d3d12_swapchain_op_destroy(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op) { + const struct dxgi_vk_funcs *vk_funcs = &swapchain->vk_funcs; + unsigned int i; + + if (op->type == D3D12_SWAPCHAIN_OP_RESIZE_BUFFERS) + { + assert(swapchain->vk_device); + + for (i = 0; i < DXGI_MAX_SWAP_CHAIN_BUFFERS; ++i) + vk_funcs->p_vkDestroyImage(swapchain->vk_device, op->resize_buffers.vk_images[i], NULL); + + vk_funcs->p_vkFreeMemory(swapchain->vk_device, op->resize_buffers.vk_memory, NULL); + } + heap_free(op); }
static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op); +static HRESULT d3d12_swapchain_op_resize_buffers_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op);
static DWORD WINAPI d3d12_swapchain_worker_proc(void *data) { @@ -1125,9 +1151,13 @@ static DWORD WINAPI d3d12_swapchain_worker_proc(void *data) case D3D12_SWAPCHAIN_OP_PRESENT: d3d12_swapchain_op_present_execute(swapchain, op); break; + + case D3D12_SWAPCHAIN_OP_RESIZE_BUFFERS: + d3d12_swapchain_op_resize_buffers_execute(swapchain, op); + break; }
- d3d12_swapchain_op_destroy(op); + d3d12_swapchain_op_destroy(swapchain, op);
if (!SetEvent(swapchain->worker_event)) ERR("Cannot set event, last error %ld.\n", GetLastError()); @@ -1702,7 +1732,7 @@ static HRESULT d3d12_swapchain_create_vulkan_swapchain(struct d3d12_swapchain *s HRESULT hr;
if (FAILED(hr = select_vk_format(vk_funcs, vk_physical_device, - swapchain->vk_surface, &swapchain->desc, &vk_swapchain_format))) + swapchain->vk_surface, &swapchain->backend_desc, &vk_swapchain_format))) return hr;
if ((vr = vk_funcs->p_vkGetPhysicalDeviceSurfaceCapabilitiesKHR(vk_physical_device, @@ -1712,28 +1742,28 @@ static HRESULT d3d12_swapchain_create_vulkan_swapchain(struct d3d12_swapchain *s return hresult_from_vk_result(vr); }
- image_count = swapchain->desc.BufferCount; + image_count = swapchain->backend_desc.BufferCount; image_count = max(image_count, surface_caps.minImageCount); if (surface_caps.maxImageCount) image_count = min(image_count, surface_caps.maxImageCount);
- if (image_count != swapchain->desc.BufferCount) + if (image_count != swapchain->backend_desc.BufferCount) { - WARN("Buffer count %u is not supported (%u-%u).\n", swapchain->desc.BufferCount, + WARN("Buffer count %u is not supported (%u-%u).\n", swapchain->backend_desc.BufferCount, surface_caps.minImageCount, surface_caps.maxImageCount); }
- width = swapchain->desc.Width; - height = swapchain->desc.Height; + width = swapchain->backend_desc.Width; + height = swapchain->backend_desc.Height; width = max(width, surface_caps.minImageExtent.width); width = min(width, surface_caps.maxImageExtent.width); height = max(height, surface_caps.minImageExtent.height); height = min(height, surface_caps.maxImageExtent.height);
- if (width != swapchain->desc.Width || height != swapchain->desc.Height) + if (width != swapchain->backend_desc.Width || height != swapchain->backend_desc.Height) { WARN("Swapchain dimensions %ux%u are not supported (%u-%u x %u-%u).\n", - swapchain->desc.Width, swapchain->desc.Height, + swapchain->backend_desc.Width, swapchain->backend_desc.Height, surface_caps.minImageExtent.width, surface_caps.maxImageExtent.width, surface_caps.minImageExtent.height, surface_caps.maxImageExtent.height); } @@ -1751,7 +1781,7 @@ static HRESULT d3d12_swapchain_create_vulkan_swapchain(struct d3d12_swapchain *s usage |= surface_caps.supportedUsageFlags & VK_IMAGE_USAGE_TRANSFER_DST_BIT; if (!(usage & VK_IMAGE_USAGE_TRANSFER_SRC_BIT) || !(usage & VK_IMAGE_USAGE_TRANSFER_DST_BIT)) WARN("Transfer not supported for swapchain images.\n"); - if (swapchain->desc.BufferUsage & DXGI_USAGE_SHADER_INPUT) + if (swapchain->backend_desc.BufferUsage & DXGI_USAGE_SHADER_INPUT) { usage |= surface_caps.supportedUsageFlags & VK_IMAGE_USAGE_SAMPLED_BIT; if (!(usage & VK_IMAGE_USAGE_SAMPLED_BIT)) @@ -1908,7 +1938,7 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain)
LIST_FOR_EACH_ENTRY_SAFE(op, op2, &swapchain->worker_ops, struct d3d12_swapchain_op, entry) { - d3d12_swapchain_op_destroy(op); + d3d12_swapchain_op_destroy(swapchain, op); }
d3d12_swapchain_destroy_vulkan_resources(swapchain); @@ -2409,13 +2439,22 @@ static HRESULT STDMETHODCALLTYPE d3d12_swapchain_GetDesc(IDXGISwapChain4 *iface, return S_OK; }
+static HRESULT d3d12_swapchain_op_resize_buffers_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op) +{ + d3d12_swapchain_destroy_vulkan_resources(swapchain); + + swapchain->backend_desc = op->resize_buffers.desc; + + return d3d12_swapchain_create_vulkan_resources(swapchain); +} + static HRESULT d3d12_swapchain_resize_buffers(struct d3d12_swapchain *swapchain, UINT buffer_count, UINT width, UINT height, DXGI_FORMAT format, UINT flags) { DXGI_SWAP_CHAIN_DESC1 *desc, new_desc; + struct d3d12_swapchain_op *op; unsigned int i; ULONG refcount; - HRESULT hr;
if (flags) FIXME("Ignoring flags %#x.\n", flags); @@ -2459,20 +2498,40 @@ static HRESULT d3d12_swapchain_resize_buffers(struct d3d12_swapchain *swapchain, if (!dxgi_validate_swapchain_desc(&new_desc)) return DXGI_ERROR_INVALID_CALL;
- swapchain->current_buffer_index = 0; - if (desc->Width == new_desc.Width && desc->Height == new_desc.Height && desc->Format == new_desc.Format && desc->BufferCount == new_desc.BufferCount) + { + swapchain->current_buffer_index = 0; return S_OK; + } + + if (!(op = heap_alloc_zero(sizeof(*op)))) + { + WARN("Cannot allocate memory.\n"); + return E_OUTOFMEMORY; + } + + op->type = D3D12_SWAPCHAIN_OP_RESIZE_BUFFERS; + op->resize_buffers.vk_memory = swapchain->vk_memory; + swapchain->vk_memory = VK_NULL_HANDLE; + memcpy(&op->resize_buffers.vk_images, &swapchain->vk_images, sizeof(swapchain->vk_images)); + memset(&swapchain->vk_images, 0, sizeof(swapchain->vk_images)); + op->resize_buffers.desc = new_desc; + + EnterCriticalSection(&swapchain->worker_cs); + list_add_tail(&swapchain->worker_ops, &op->entry); + WakeAllConditionVariable(&swapchain->worker_cv); + LeaveCriticalSection(&swapchain->worker_cs); + + WaitForSingleObject(swapchain->worker_event, INFINITE); + + swapchain->current_buffer_index = 0;
- d3d12_swapchain_destroy_vulkan_resources(swapchain); d3d12_swapchain_destroy_d3d12_resources(swapchain); - swapchain->desc = new_desc;
- if (FAILED(hr = d3d12_swapchain_create_d3d12_resources(swapchain))) - return hr; + swapchain->desc = new_desc;
- return d3d12_swapchain_create_vulkan_resources(swapchain); + return d3d12_swapchain_create_d3d12_resources(swapchain); }
static HRESULT STDMETHODCALLTYPE d3d12_swapchain_ResizeBuffers(IDXGISwapChain4 *iface, @@ -3039,6 +3098,7 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI
swapchain->window = window; swapchain->desc = *swapchain_desc; + swapchain->backend_desc = *swapchain_desc; swapchain->fullscreen_desc = *fullscreen_desc;
swapchain->present_mode = VK_PRESENT_MODE_FIFO_KHR;
From: Giovanni Mascellani gmascellani@codeweavers.com
--- dlls/dxgi/swapchain.c | 19 ------------------- 1 file changed, 19 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index ea93efb5643..81cb6dcd575 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1029,7 +1029,6 @@ struct d3d12_swapchain CONDITION_VARIABLE worker_cv; bool worker_running; struct list worker_ops; - HANDLE worker_event;
/* D3D12 side of the swapchain (frontend): these objects are * visible to the IDXGISwapChain client, so they must never be @@ -1159,9 +1158,6 @@ static DWORD WINAPI d3d12_swapchain_worker_proc(void *data)
d3d12_swapchain_op_destroy(swapchain, op);
- if (!SetEvent(swapchain->worker_event)) - ERR("Cannot set event, last error %ld.\n", GetLastError()); - EnterCriticalSection(&swapchain->worker_cs); } else if (!SleepConditionVariableCS(&swapchain->worker_cv, &swapchain->worker_cs, INFINITE)) @@ -1931,9 +1927,6 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain) ERR("Failed to close worker thread, last error %ld.\n", GetLastError()); }
- if (!CloseHandle(swapchain->worker_event)) - ERR("Failed to close worker event, last error %ld.\n", GetLastError()); - DeleteCriticalSection(&swapchain->worker_cs);
LIST_FOR_EACH_ENTRY_SAFE(op, op2, &swapchain->worker_ops, struct d3d12_swapchain_op, entry) @@ -2243,8 +2236,6 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, WakeAllConditionVariable(&swapchain->worker_cv); LeaveCriticalSection(&swapchain->worker_cs);
- WaitForSingleObject(swapchain->worker_event, INFINITE); - ++swapchain->frame_number; if ((frame_latency_event = swapchain->frame_latency_event)) { @@ -2523,8 +2514,6 @@ static HRESULT d3d12_swapchain_resize_buffers(struct d3d12_swapchain *swapchain, WakeAllConditionVariable(&swapchain->worker_cv); LeaveCriticalSection(&swapchain->worker_cs);
- WaitForSingleObject(swapchain->worker_event, INFINITE); - swapchain->current_buffer_index = 0;
d3d12_swapchain_destroy_d3d12_resources(swapchain); @@ -3194,14 +3183,6 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI swapchain->worker_running = true; list_init(&swapchain->worker_ops);
- if (!(swapchain->worker_event = CreateEventA(NULL, FALSE, FALSE, NULL))) - { - hr = HRESULT_FROM_WIN32(GetLastError()); - WARN("Failed to create worker event, hr %#lx.\n", hr); - d3d12_swapchain_destroy(swapchain); - return hr; - } - surface_desc.sType = VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR; surface_desc.pNext = NULL; surface_desc.flags = 0;
From: Giovanni Mascellani gmascellani@codeweavers.com
The implementation even asserts its return value. --- dlls/dxgi/swapchain.c | 25 +++++-------------------- 1 file changed, 5 insertions(+), 20 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index 81cb6dcd575..e70e4b042c1 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1663,16 +1663,9 @@ static void d3d12_swapchain_destroy_vulkan_resources(struct d3d12_swapchain *swa
if (swapchain->command_queue) { - if ((vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue))) - { - vk_funcs->p_vkQueueWaitIdle(vk_queue); - - vkd3d_release_vk_queue(swapchain->command_queue); - } - else - { - WARN("Failed to acquire Vulkan queue.\n"); - } + vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue); + vk_funcs->p_vkQueueWaitIdle(vk_queue); + vkd3d_release_vk_queue(swapchain->command_queue); }
if (swapchain->vk_device) @@ -2147,11 +2140,7 @@ static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapch if (FAILED(hr = d3d12_swapchain_set_sync_interval(swapchain, op->present.sync_interval))) return hr;
- if (!(vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue))) - { - ERR("Failed to acquire Vulkan queue.\n"); - return E_FAIL; - } + vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue);
vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image); if (vr == VK_ERROR_OUT_OF_DATE_KHR) @@ -2164,11 +2153,7 @@ static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapch if (FAILED(hr = d3d12_swapchain_create_vulkan_resources(swapchain))) return hr;
- if (!(vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue))) - { - ERR("Failed to acquire Vulkan queue.\n"); - return E_FAIL; - } + vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue);
if ((vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image)) < 0) ERR("Failed to present after recreating swapchain, vr %d.\n", vr);
From: Giovanni Mascellani gmascellani@codeweavers.com
--- dlls/dxgi/swapchain.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index e70e4b042c1..d3140c83c39 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -2068,13 +2068,14 @@ static HRESULT d3d12_swapchain_set_sync_interval(struct d3d12_swapchain *swapcha return d3d12_swapchain_create_vulkan_resources(swapchain); }
-static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, VkQueue vk_queue, VkImage vk_src_image) +static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, VkImage vk_src_image) { const struct dxgi_vk_funcs *vk_funcs = &swapchain->vk_funcs; VkPresentInfoKHR present_info; VkCommandBuffer vk_cmd_buffer; VkSubmitInfo submit_info; VkImage vk_dst_image; + VkQueue vk_queue; VkResult vr;
if (swapchain->vk_image_index == INVALID_VK_IMAGE_INDEX) @@ -2098,6 +2099,8 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, vk_cmd_buffer, vk_dst_image, vk_src_image)) < 0 ) return vr;
+ vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue); + submit_info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO; submit_info.pNext = NULL; submit_info.waitSemaphoreCount = 0; @@ -2111,6 +2114,7 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, if ((vr = vk_funcs->p_vkQueueSubmit(vk_queue, 1, &submit_info, VK_NULL_HANDLE)) < 0) { ERR("Failed to blit swapchain buffer, vr %d.\n", vr); + vkd3d_release_vk_queue(swapchain->command_queue); return vr; }
@@ -2128,39 +2132,32 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, if ((vr = vk_funcs->p_vkQueuePresentKHR(vk_queue, &present_info)) >= 0) swapchain->vk_image_index = INVALID_VK_IMAGE_INDEX;
+ vkd3d_release_vk_queue(swapchain->command_queue); + return vr; }
static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapchain, struct d3d12_swapchain_op *op) { - VkQueue vk_queue; VkResult vr; HRESULT hr;
if (FAILED(hr = d3d12_swapchain_set_sync_interval(swapchain, op->present.sync_interval))) return hr;
- vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue); - - vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image); + vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image); if (vr == VK_ERROR_OUT_OF_DATE_KHR) { - vkd3d_release_vk_queue(swapchain->command_queue); - TRACE("Recreating Vulkan swapchain.\n");
d3d12_swapchain_destroy_vulkan_resources(swapchain); if (FAILED(hr = d3d12_swapchain_create_vulkan_resources(swapchain))) return hr;
- vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue); - - if ((vr = d3d12_swapchain_queue_present(swapchain, vk_queue, op->present.vk_image)) < 0) + if ((vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image)) < 0) ERR("Failed to present after recreating swapchain, vr %d.\n", vr); }
- vkd3d_release_vk_queue(swapchain->command_queue); - if (vr < 0) { ERR("Failed to queue present, vr %d.\n", vr);
From: Giovanni Mascellani gmascellani@codeweavers.com
The reason is explained in upstream commit b2a1f6b5e4f59fbc7f91ada7e565639dcf4e8e7f, which also applies to earlier vkd3d versions. --- dlls/dxgi/swapchain.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-)
diff --git a/dlls/dxgi/swapchain.c b/dlls/dxgi/swapchain.c index d3140c83c39..64c5958f780 100644 --- a/dlls/dxgi/swapchain.c +++ b/dlls/dxgi/swapchain.c @@ -1056,6 +1056,8 @@ struct d3d12_swapchain VkPresentModeKHR present_mode; DXGI_SWAP_CHAIN_DESC1 backend_desc;
+ ID3D12Fence *present_fence; + uint32_t vk_image_index;
struct dxgi_vk_funcs vk_funcs; @@ -1930,6 +1932,9 @@ static void d3d12_swapchain_destroy(struct d3d12_swapchain *swapchain) d3d12_swapchain_destroy_vulkan_resources(swapchain); d3d12_swapchain_destroy_d3d12_resources(swapchain);
+ if (swapchain->present_fence) + ID3D12Fence_Release(swapchain->present_fence); + if (swapchain->frame_latency_event) CloseHandle(swapchain->frame_latency_event);
@@ -2068,7 +2073,8 @@ static HRESULT d3d12_swapchain_set_sync_interval(struct d3d12_swapchain *swapcha return d3d12_swapchain_create_vulkan_resources(swapchain); }
-static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, VkImage vk_src_image) +static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, VkImage vk_src_image, + uint64_t frame_number) { const struct dxgi_vk_funcs *vk_funcs = &swapchain->vk_funcs; VkPresentInfoKHR present_info; @@ -2077,6 +2083,7 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, VkImage vk_dst_image; VkQueue vk_queue; VkResult vr; + HRESULT hr;
if (swapchain->vk_image_index == INVALID_VK_IMAGE_INDEX) { @@ -2099,6 +2106,12 @@ static VkResult d3d12_swapchain_queue_present(struct d3d12_swapchain *swapchain, vk_cmd_buffer, vk_dst_image, vk_src_image)) < 0 ) return vr;
+ if (FAILED(hr = ID3D12Fence_SetEventOnCompletion(swapchain->present_fence, frame_number + 1, NULL))) + { + ERR("Failed to wait for present event, hr %#lx.\n", hr); + return VK_ERROR_UNKNOWN; + } + vk_queue = vkd3d_acquire_vk_queue(swapchain->command_queue);
submit_info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO; @@ -2145,7 +2158,7 @@ static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapch if (FAILED(hr = d3d12_swapchain_set_sync_interval(swapchain, op->present.sync_interval))) return hr;
- vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image); + vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image, op->present.frame_number); if (vr == VK_ERROR_OUT_OF_DATE_KHR) { TRACE("Recreating Vulkan swapchain.\n"); @@ -2154,7 +2167,7 @@ static HRESULT d3d12_swapchain_op_present_execute(struct d3d12_swapchain *swapch if (FAILED(hr = d3d12_swapchain_create_vulkan_resources(swapchain))) return hr;
- if ((vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image)) < 0) + if ((vr = d3d12_swapchain_queue_present(swapchain, op->present.vk_image, op->present.frame_number)) < 0) ERR("Failed to present after recreating swapchain, vr %d.\n", vr); }
@@ -2233,6 +2246,13 @@ static HRESULT d3d12_swapchain_present(struct d3d12_swapchain *swapchain, } }
+ if (FAILED(hr = ID3D12CommandQueue_Signal(swapchain->command_queue, + swapchain->present_fence, swapchain->frame_number))) + { + ERR("Failed to signal present fence, hf %#lx.\n", hr); + return hr; + } + swapchain->current_buffer_index = (swapchain->current_buffer_index + 1) % swapchain->desc.BufferCount; return S_OK; } @@ -3236,6 +3256,14 @@ static HRESULT d3d12_swapchain_init(struct d3d12_swapchain *swapchain, IWineDXGI } }
+ if (FAILED(hr = ID3D12Device_CreateFence(device, 0, 0, + &IID_ID3D12Fence, (void **)&swapchain->present_fence))) + { + WARN("Failed to create present fence, hr %#lx.\n", hr); + d3d12_swapchain_destroy(swapchain); + return hr; + } + IWineDXGIFactory_AddRef(swapchain->factory = factory);
if (!(swapchain->worker_thread = CreateThread(NULL, 0, d3d12_swapchain_worker_proc, swapchain, 0, NULL)))
This merge request was approved by Zebediah Figura.
I don't 100% understand why we need the per swapchain threads, is it because there's no asynchronous APIs in Vulkan for all the needed waits that we could use single threadedly? (whether from a vkd3d background thread, if there's one, or a dedicated thread) That was my takeaway from our conversation, but I'm not certain I got it right.
Is allocating the ops on the heap a potential performance concern? I recall removing allocations in wined3d and using separate heaps because the global heap lock was getting pretty contended with some games. Ideally it would be just a flat array queue, but if there's only a couple ops per frame and we don't care that much, a low effort thing to do would be to keep a free list.
I don't 100% understand why we need the per swapchain threads, is it because there's no asynchronous APIs in Vulkan for all the needed waits that we could use single threadedly? (whether from a vkd3d background thread, if there's one, or a dedicated thread) That was my takeaway from our conversation, but I'm not certain I got it right.
Yeah, that's the reason. Vulkan has `VkFence` and `vkWaitForFences()` to wait on the first one which becomes signaled, but unfortunately not all Vulkan operations that can wait expose the end of the wait with a `VkFence`. Specifically, `vkAcquireNextImageKHR()` doesn't (it *also* takes a fence to signal, but the call itself might wait). So you can't use a thread for more than one swapchain, because you don't know which one will be the first to have an available image.
`vkWaitForPresentKHR()` has a similar problem. We don't use it yet, but we'll probably want to in the future.
This looks like an oversight in the Vulkan design. Extension `VK_KHR_present_wait` [offers some criteria for that choice in its "Issues" section](https://registry.khronos.org/vulkan/specs/1.3-extensions/html/chap54.html#_i...). At any rate, we're bound to what the specs allow us.
Is allocating the ops on the heap a potential performance concern? I recall removing allocations in wined3d and using separate heaps because the global heap lock was getting pretty contended with some games. Ideally it would be just a flat array queue, but if there's only a couple ops per frame and we don't care that much, a low effort thing to do would be to keep a free list.
Yeah, I expect that in normal cirumstances this only requires on allocation per frame, which I guess should be relatively easy to handle (I expect wined3d to be potentially much more stressed). My feeling is that this shouldn't be a problem, and that it can be addressed later if it shows up.
This merge request was approved by Jan Sikorski.