This series aims to improve GPU-side performance by avoiding VK_IMAGE_LAYOUT_GENERAL for textures that are used both as render target and as shader resource view. To do so, we have to transition them between VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL and VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL depending on use (and likewise for depth stencils, using VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL).
It improves Rocket League performance from 80 fps to 100 fps in a GPU-limited configuration on my Radeon Polaris GPU.
This MR is marked as draft for now because I am not convinced by patch 3 yet. For actual submission I think I'll create separate MRs for patches 1-3 and 4-8.
Patch 2 introduces a validation layer error in the d3d11 tests that gets fixed in patch 3. No new test failures are introduced, although none of the existing ones are fixed either.
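The layout choice described above can be sketched roughly as follows. This is an illustration only: the `sketch_*` enum, the `SKETCH_BIND_*` flags and the function are hypothetical stand-ins for the Vulkan layouts and the wined3d bind flags, chosen so that the example compiles without the Vulkan headers. The logic actually used by the series is `wined3d_layout_from_bind_mask()` in patch 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VkImageLayout values used in the series. */
enum sketch_layout
{
    SKETCH_LAYOUT_GENERAL,
    SKETCH_LAYOUT_COLOR_ATTACHMENT,
    SKETCH_LAYOUT_DEPTH_STENCIL_ATTACHMENT,
    SKETCH_LAYOUT_SHADER_READ_ONLY,
};

/* Hypothetical stand-ins for the WINED3D_BIND_* flags. */
#define SKETCH_BIND_SHADER_RESOURCE 0x1u
#define SKETCH_BIND_RENDER_TARGET   0x2u
#define SKETCH_BIND_DEPTH_STENCIL   0x4u

/* Pick the layout for the next use of a texture. A texture that is already
 * in the general layout stays there, so we never switch back and forth
 * between the general layout and an optimal one. */
static enum sketch_layout sketch_layout_for_use(enum sketch_layout current, uint32_t bind)
{
    /* Exactly one use at a time. */
    assert(bind && !(bind & (bind - 1)));

    if (current == SKETCH_LAYOUT_GENERAL)
        return SKETCH_LAYOUT_GENERAL;

    switch (bind)
    {
        case SKETCH_BIND_RENDER_TARGET:
            return SKETCH_LAYOUT_COLOR_ATTACHMENT;

        case SKETCH_BIND_DEPTH_STENCIL:
            return SKETCH_LAYOUT_DEPTH_STENCIL_ATTACHMENT;

        case SKETCH_BIND_SHADER_RESOURCE:
            return SKETCH_LAYOUT_SHADER_READ_ONLY;

        default:
            /* Unknown use: fall back to the layout that is valid for everything. */
            return SKETCH_LAYOUT_GENERAL;
    }
}
```

With this rule, a render target that is later sampled moves from the colour attachment layout to the shader-read-only layout and back, while a texture that started out in the general layout never leaves it.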
From: Stefan Dösinger stefan@codeweavers.com
--- dlls/wined3d/swapchain.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/dlls/wined3d/swapchain.c b/dlls/wined3d/swapchain.c index 337c0410d39..bd4472d746a 100644 --- a/dlls/wined3d/swapchain.c +++ b/dlls/wined3d/swapchain.c @@ -1163,6 +1163,7 @@ static void wined3d_swapchain_vk_rotate(struct wined3d_swapchain *swapchain, str struct wined3d_image_vk image0; VkDescriptorImageInfo vk_info0; VkImageLayout vk_layout0; + uint32_t bind_mask0; DWORD locations0; unsigned int i;
@@ -1176,6 +1177,7 @@ static void wined3d_swapchain_vk_rotate(struct wined3d_swapchain *swapchain, str /* Back buffer 0 is already in the draw binding. */ image0 = texture_prev->image; vk_layout0 = texture_prev->layout; + bind_mask0 = texture_prev->bind_mask; vk_info0 = texture_prev->default_image_info; locations0 = texture_prev->t.sub_resources[0].locations;
@@ -1189,6 +1191,7 @@ static void wined3d_swapchain_vk_rotate(struct wined3d_swapchain *swapchain, str
texture_prev->image = texture->image; texture_prev->layout = texture->layout; + texture_prev->bind_mask = texture->bind_mask; texture_prev->default_image_info = texture->default_image_info;
wined3d_texture_validate_location(&texture_prev->t, 0, sub_resource->locations & supported_locations); @@ -1199,6 +1202,7 @@ static void wined3d_swapchain_vk_rotate(struct wined3d_swapchain *swapchain, str
texture_prev->image = image0; texture_prev->layout = vk_layout0; + texture_prev->bind_mask = bind_mask0; texture_prev->default_image_info = vk_info0;
wined3d_texture_validate_location(&texture_prev->t, 0, locations0 & supported_locations);
From: Stefan Dösinger stefan@codeweavers.com
There are a few things I am not quite sure about:
1) Assigning VkDescriptorImageInfo.imageLayout
Is wined3d_texture_vk_get_default_image_info only ever used for shader resource views, or might it also be used for render target / depth stencil?
2) Using a resource as DS/RT and shader resource at the same time is handled by the next patch. I have split this up because I think it makes the patches easier to read, but this patch on its own introduces a temporary validation failure.
--- dlls/wined3d/texture.c | 87 +++++++++++++++++++++++++++++------------- dlls/wined3d/view.c | 5 ++- 2 files changed, 65 insertions(+), 27 deletions(-)
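To make question 1 concrete, here is a small sketch of the rule this patch hardcodes into descriptor image infos. The `sketch_*` names are hypothetical stand-ins, not wined3d or Vulkan identifiers. It illustrates why the rule only fits shader resource use: for a texture that currently sits in an attachment layout, the descriptor still advertises the shader-read-only layout, so a barrier has to move the texture there before it is sampled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the VkImageLayout values involved. */
enum sketch_layout
{
    SKETCH_LAYOUT_GENERAL,
    SKETCH_LAYOUT_COLOR_ATTACHMENT,
    SKETCH_LAYOUT_SHADER_READ_ONLY,
};

/* Layout written into the descriptor when the view is created: general
 * stays general, everything else is assumed to be sampled in the
 * shader-read-only layout. */
static enum sketch_layout sketch_descriptor_layout(enum sketch_layout texture_layout)
{
    if (texture_layout == SKETCH_LAYOUT_GENERAL)
        return SKETCH_LAYOUT_GENERAL;
    return SKETCH_LAYOUT_SHADER_READ_ONLY;
}

/* True if the texture's current layout differs from what the descriptor
 * advertises, i.e. a layout transition is needed before sampling. */
static bool sketch_needs_transition_before_sampling(enum sketch_layout texture_layout)
{
    return sketch_descriptor_layout(texture_layout) != texture_layout;
}
```

If the default image info were also used for render target or depth stencil access, the hardcoded shader-read-only layout in this sketch would be wrong for that use, which is the concern raised in question 1.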
diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index 33b938ad460..c76a5573fd9 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -4903,10 +4903,13 @@ const VkDescriptorImageInfo *wined3d_texture_vk_get_default_image_info(struct wi return NULL; }
- TRACE("Created image view 0x%s.\n", wine_dbgstr_longlong(texture_vk->default_image_info.imageView)); + TRACE("Created image view 0x%s, texture %p.\n", wine_dbgstr_longlong(texture_vk->default_image_info.imageView), &texture_vk->t);
texture_vk->default_image_info.sampler = VK_NULL_HANDLE; - texture_vk->default_image_info.imageLayout = texture_vk->layout; + if (texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + texture_vk->default_image_info.imageLayout = VK_IMAGE_LAYOUT_GENERAL; + else + texture_vk->default_image_info.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
return &texture_vk->default_image_info; } @@ -5265,7 +5268,7 @@ static void wined3d_texture_vk_download_data(struct wined3d_context *context, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_ACCESS_TRANSFER_READ_BIT, vk_access_mask_from_bind_flags(src_texture_vk->t.resource.bind_flags), - VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, src_texture_vk->layout, + VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, src_texture_vk->layout, src_texture_vk->image.vk_image, &vk_range);
wined3d_context_vk_reference_texture(context_vk, src_texture_vk); @@ -5511,26 +5514,18 @@ BOOL wined3d_texture_vk_prepare_texture(struct wined3d_texture_vk *texture_vk, if (resource->bind_flags & WINED3D_BIND_UNORDERED_ACCESS) vk_usage |= VK_IMAGE_USAGE_STORAGE_BIT;
- texture_vk->layout = VK_IMAGE_LAYOUT_GENERAL; - if (wined3d_popcount(resource->bind_flags) == 1) + if (resource->bind_flags & WINED3D_BIND_UNORDERED_ACCESS) + texture_vk->layout = VK_IMAGE_LAYOUT_GENERAL; + else if (resource->bind_flags & WINED3D_BIND_RENDER_TARGET) + texture_vk->layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; + else if (resource->bind_flags & WINED3D_BIND_DEPTH_STENCIL) + texture_vk->layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; + else if (resource->bind_flags & WINED3D_BIND_SHADER_RESOURCE) + texture_vk->layout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; + else { - switch (resource->bind_flags) - { - case WINED3D_BIND_RENDER_TARGET: - texture_vk->layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; - break; - - case WINED3D_BIND_DEPTH_STENCIL: - texture_vk->layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; - break; - - case WINED3D_BIND_SHADER_RESOURCE: - texture_vk->layout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; - break; - - default: - break; - } + FIXME("unexpected bind flags %s, using VK_IMAGE_LAYOUT_GENERAL\n", wined3d_debug_bind_flags(resource->bind_flags)); + texture_vk->layout = VK_IMAGE_LAYOUT_GENERAL; }
if (!wined3d_context_vk_create_image(context_vk, vk_image_type, vk_usage, format_vk->vk_format, @@ -5540,6 +5535,10 @@ BOOL wined3d_texture_vk_prepare_texture(struct wined3d_texture_vk *texture_vk, return FALSE; }
+ /* We can't use a zero src access mask without synchronization2. Set the last-used bind mask to something + * non-zero to avoid this. */ + texture_vk->bind_mask = resource->bind_flags; + vk_range.aspectMask = vk_aspect_mask_from_format(&format_vk->f); vk_range.baseMipLevel = 0; vk_range.levelCount = VK_REMAINING_MIP_LEVELS; @@ -5706,15 +5705,48 @@ HRESULT wined3d_texture_vk_init(struct wined3d_texture_vk *texture_vk, struct wi flags, device, parent, parent_ops, &texture_vk[1], &wined3d_texture_vk_ops); }
+enum VkImageLayout wined3d_layout_from_bind_mask(const struct wined3d_texture_vk *texture_vk, const uint32_t bind_mask) +{ + assert(wined3d_popcount(bind_mask) == 1); + + /* We want to avoid switching between LAYOUT_GENERAL and other layouts. In Radeon GPUs (and presumably + * others), this will trigger decompressing and recompressing the texture. We also hardcode the layout + * into views when they are created. */ + if (texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + return VK_IMAGE_LAYOUT_GENERAL; + + switch (bind_mask) + { + case WINED3D_BIND_RENDER_TARGET: + return VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; + + case WINED3D_BIND_DEPTH_STENCIL: + return VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL; + + case WINED3D_BIND_SHADER_RESOURCE: + return VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; + + default: + ERR("Unexpected bind mask %s.\n", wined3d_debug_bind_flags(bind_mask)); + return VK_IMAGE_LAYOUT_GENERAL; + } +} + void wined3d_texture_vk_barrier(struct wined3d_texture_vk *texture_vk, struct wined3d_context_vk *context_vk, uint32_t bind_mask) { + enum VkImageLayout new_layout; uint32_t src_bind_mask = 0;
TRACE("texture_vk %p, context_vk %p, bind_mask %s.\n", texture_vk, context_vk, wined3d_debug_bind_flags(bind_mask));
- if (bind_mask & ~WINED3D_READ_ONLY_BIND_MASK) + new_layout = wined3d_layout_from_bind_mask(texture_vk, bind_mask); + + /* A layout transition is potentially a read-write operation, so even if we + * prepare the texture to e.g. read only shader resource mode, we have to wait + * for past operations to finish. */ + if (bind_mask & ~WINED3D_READ_ONLY_BIND_MASK || new_layout != texture_vk->layout) { src_bind_mask = texture_vk->bind_mask & WINED3D_READ_ONLY_BIND_MASK; if (!src_bind_mask) @@ -5732,8 +5764,9 @@ void wined3d_texture_vk_barrier(struct wined3d_texture_vk *texture_vk, { VkImageSubresourceRange vk_range;
- TRACE(" %s -> %s.\n", - wined3d_debug_bind_flags(src_bind_mask), wined3d_debug_bind_flags(bind_mask)); + TRACE(" %s(%x) -> %s(%x).\n", + wined3d_debug_bind_flags(src_bind_mask), texture_vk->layout, + wined3d_debug_bind_flags(bind_mask), new_layout);
vk_range.aspectMask = vk_aspect_mask_from_format(texture_vk->t.resource.format); vk_range.baseMipLevel = 0; @@ -5745,7 +5778,9 @@ void wined3d_texture_vk_barrier(struct wined3d_texture_vk *texture_vk, vk_pipeline_stage_mask_from_bind_flags(src_bind_mask), vk_pipeline_stage_mask_from_bind_flags(bind_mask), vk_access_mask_from_bind_flags(src_bind_mask), vk_access_mask_from_bind_flags(bind_mask), - texture_vk->layout, texture_vk->layout, texture_vk->image.vk_image, &vk_range); + texture_vk->layout, new_layout, texture_vk->image.vk_image, &vk_range); + + texture_vk->layout = new_layout; } }
diff --git a/dlls/wined3d/view.c b/dlls/wined3d/view.c index 11f099b7ec5..d22edd5ce58 100644 --- a/dlls/wined3d/view.c +++ b/dlls/wined3d/view.c @@ -1188,7 +1188,10 @@ static void wined3d_shader_resource_view_vk_cs_init(void *object)
srv_vk->view_vk.u.vk_image_info.imageView = vk_image_view; srv_vk->view_vk.u.vk_image_info.sampler = VK_NULL_HANDLE; - srv_vk->view_vk.u.vk_image_info.imageLayout = texture_vk->layout; + if (texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + srv_vk->view_vk.u.vk_image_info.imageLayout = VK_IMAGE_LAYOUT_GENERAL; + else + srv_vk->view_vk.u.vk_image_info.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL; }
HRESULT wined3d_shader_resource_view_vk_init(struct wined3d_shader_resource_view_vk *view_vk,
From: Stefan Dösinger stefan@codeweavers.com
The way I am updating the view could be made a bit prettier.
--- dlls/wined3d/context_vk.c | 20 ++++++++++++++++++++ dlls/wined3d/texture.c | 30 ++++++++++++++++++++++++++++++ dlls/wined3d/view.c | 8 ++++++++ dlls/wined3d/wined3d_private.h | 2 ++ 4 files changed, 60 insertions(+)
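The view update in this patch boils down to refreshing a cached layout when it no longer matches the texture. A simplified sketch of that idea, using hypothetical `sketch_*` types instead of the wined3d structures, and ignoring the actual Vulkan barrier:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the VkImageLayout values involved. */
enum sketch_layout
{
    SKETCH_LAYOUT_GENERAL,
    SKETCH_LAYOUT_COLOR_ATTACHMENT,
    SKETCH_LAYOUT_SHADER_READ_ONLY,
};

struct sketch_texture
{
    enum sketch_layout layout;          /* current layout of the image */
};

struct sketch_view
{
    const struct sketch_texture *texture;
    enum sketch_layout cached_layout;   /* layout stored in the descriptor info */
};

/* Move the texture to the general layout, as needed when it is bound as
 * attachment and shader resource at the same time. Existing views are not
 * touched here, so their cached layout can become stale. */
static void sketch_make_generic(struct sketch_texture *texture)
{
    texture->layout = SKETCH_LAYOUT_GENERAL;
}

/* Called before the descriptor is written: refresh the cached layout if it
 * no longer matches the texture. Returns true if the view was updated. */
static bool sketch_view_refresh(struct sketch_view *view)
{
    assert(view && view->texture);

    if (view->cached_layout == view->texture->layout)
        return false;

    view->cached_layout = view->texture->layout;
    return true;
}
```

The refresh is lazy: the texture does not need to know which views exist, at the cost of one comparison each time a descriptor is written.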
diff --git a/dlls/wined3d/context_vk.c b/dlls/wined3d/context_vk.c index 1acb8a8d201..8b676eb9d5c 100644 --- a/dlls/wined3d/context_vk.c +++ b/dlls/wined3d/context_vk.c @@ -2599,6 +2599,14 @@ static bool wined3d_context_vk_begin_render_pass(struct wined3d_context_vk *cont continue;
rtv_vk = wined3d_rendertarget_view_vk(view); + + if (rtv_vk->v.resource->bind_count) + { + struct wined3d_texture_vk *texture_vk; + texture_vk = wined3d_texture_vk(wined3d_texture_from_resource(rtv_vk->v.resource)); + wined3d_texture_vk_make_generic(texture_vk, context_vk); + } + vk_views[attachment_count] = wined3d_rendertarget_view_vk_get_image_view(rtv_vk, context_vk); wined3d_rendertarget_view_vk_barrier(rtv_vk, context_vk, WINED3D_BIND_RENDER_TARGET); wined3d_context_vk_reference_rendertarget_view(context_vk, rtv_vk); @@ -2634,6 +2642,14 @@ static bool wined3d_context_vk_begin_render_pass(struct wined3d_context_vk *cont if ((view = state->fb.depth_stencil)) { rtv_vk = wined3d_rendertarget_view_vk(view); + + if (rtv_vk->v.resource->bind_count) + { + struct wined3d_texture_vk *texture_vk; + texture_vk = wined3d_texture_vk(wined3d_texture_from_resource(rtv_vk->v.resource)); + wined3d_texture_vk_make_generic(texture_vk, context_vk); + } + vk_views[attachment_count] = wined3d_rendertarget_view_vk_get_image_view(rtv_vk, context_vk); wined3d_rendertarget_view_vk_barrier(rtv_vk, context_vk, WINED3D_BIND_DEPTH_STENCIL); wined3d_context_vk_reference_rendertarget_view(context_vk, rtv_vk); @@ -3023,7 +3039,11 @@ static bool wined3d_shader_descriptor_writes_vk_add_srv_write(struct wined3d_sha struct wined3d_texture_vk *texture_vk = wined3d_texture_vk(texture_from_resource(resource));
if (view_vk->u.vk_image_info.imageView) + { image_info = &view_vk->u.vk_image_info; + if (image_info->imageLayout != texture_vk->layout) + wined3d_shader_resource_view_vk_update(srv_vk, context_vk); + } else image_info = wined3d_texture_vk_get_default_image_info(texture_vk, context_vk); buffer_view = NULL; diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index c76a5573fd9..2f832d030c3 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5784,6 +5784,36 @@ void wined3d_texture_vk_barrier(struct wined3d_texture_vk *texture_vk, } }
+/* This is called when a texture is used as render target and shader resource + * or depth stencil and shader resource at the same time. This can either be + * read-only simultaneous use as depth stencil, but also for rendering to one + * subresource while reading from another. Without tracking of barriers and + * layouts per subresource VK_IMAGE_LAYOUT_GENERAL is the only thing we can do. */ +void wined3d_texture_vk_make_generic(struct wined3d_texture_vk *texture_vk, + struct wined3d_context_vk *context_vk) +{ + VkImageSubresourceRange vk_range; + + if (texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + return; + + vk_range.aspectMask = vk_aspect_mask_from_format(texture_vk->t.resource.format); + vk_range.baseMipLevel = 0; + vk_range.levelCount = VK_REMAINING_MIP_LEVELS; + vk_range.baseArrayLayer = 0; + vk_range.layerCount = VK_REMAINING_ARRAY_LAYERS; + + wined3d_context_vk_image_barrier(context_vk, wined3d_context_vk_get_command_buffer(context_vk), + VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, + 0, 0, + texture_vk->layout, VK_IMAGE_LAYOUT_GENERAL, texture_vk->image.vk_image, &vk_range); + + texture_vk->layout = VK_IMAGE_LAYOUT_GENERAL; + /* FIXME: Can I really do this that easily? */ + texture_vk->default_image_info.imageLayout = VK_IMAGE_LAYOUT_GENERAL; +} + static void ffp_blitter_destroy(struct wined3d_blitter *blitter, struct wined3d_context *context) { struct wined3d_blitter *next; diff --git a/dlls/wined3d/view.c b/dlls/wined3d/view.c index d22edd5ce58..3a5efe5368b 100644 --- a/dlls/wined3d/view.c +++ b/dlls/wined3d/view.c @@ -1109,6 +1109,14 @@ void wined3d_shader_resource_view_vk_update(struct wined3d_shader_resource_view_ struct wined3d_buffer_vk *buffer_vk; VkBufferView vk_buffer_view;
+ if (resource->type != WINED3D_RTYPE_BUFFER) + { + struct wined3d_texture *texture = wined3d_texture_from_resource(resource); + const struct wined3d_texture_vk *texture_vk = wined3d_texture_vk(texture); + srv_vk->view_vk.u.vk_image_info.imageLayout = texture_vk->layout; + return; + } + buffer_vk = wined3d_buffer_vk(buffer_from_resource(resource)); wined3d_context_vk_destroy_vk_buffer_view(context_vk, view_vk->u.vk_buffer_view, view_vk->command_buffer_id); if ((vk_buffer_view = wined3d_view_vk_create_vk_buffer_view(context_vk, desc, buffer_vk, view_format_vk))) diff --git a/dlls/wined3d/wined3d_private.h b/dlls/wined3d/wined3d_private.h index 2d420a60369..bdf871a5f1d 100644 --- a/dlls/wined3d/wined3d_private.h +++ b/dlls/wined3d/wined3d_private.h @@ -4876,6 +4876,8 @@ const VkDescriptorImageInfo *wined3d_texture_vk_get_default_image_info(struct wi HRESULT wined3d_texture_vk_init(struct wined3d_texture_vk *texture_vk, struct wined3d_device *device, const struct wined3d_resource_desc *desc, unsigned int layer_count, unsigned int level_count, uint32_t flags, void *parent, const struct wined3d_parent_ops *parent_ops) DECLSPEC_HIDDEN; +void wined3d_texture_vk_make_generic(struct wined3d_texture_vk *texture_vk, + struct wined3d_context_vk *context_vk) DECLSPEC_HIDDEN; BOOL wined3d_texture_vk_prepare_texture(struct wined3d_texture_vk *texture_vk, struct wined3d_context_vk *context_vk) DECLSPEC_HIDDEN;
From: Stefan Dösinger stefan@codeweavers.com
---
Post-blit changes will come in the next patches.
--- dlls/wined3d/swapchain.c | 2 +- dlls/wined3d/texture.c | 10 +++++----- dlls/wined3d/view.c | 6 +++--- 3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/dlls/wined3d/swapchain.c b/dlls/wined3d/swapchain.c index bd4472d746a..ab36b8bc1f1 100644 --- a/dlls/wined3d/swapchain.c +++ b/dlls/wined3d/swapchain.c @@ -1086,7 +1086,7 @@ static VkResult wined3d_swapchain_vk_blit(struct wined3d_swapchain_vk *swapchain
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(back_buffer_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(back_buffer_vk->bind_mask), VK_ACCESS_TRANSFER_READ_BIT, back_buffer_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, back_buffer_vk->image.vk_image, &vk_range); diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index 2f832d030c3..54dd24e0f5c 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5066,7 +5066,7 @@ static void wined3d_texture_vk_upload_data(struct wined3d_context *context,
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(dst_texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, dst_texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, dst_texture_vk->image.vk_image, &vk_range); @@ -5242,7 +5242,7 @@ static void wined3d_texture_vk_download_data(struct wined3d_context *context,
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(src_texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(src_texture_vk->bind_mask), VK_ACCESS_TRANSFER_READ_BIT, src_texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, src_texture_vk->image.vk_image, &vk_range); @@ -5364,7 +5364,7 @@ static bool wined3d_texture_vk_clear(struct wined3d_texture_vk *texture_vk,
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), VK_ACCESS_TRANSFER_WRITE_BIT, + vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, vk_image, &vk_range);
if (format->depth_size || format->stencil_size) @@ -7187,12 +7187,12 @@ static DWORD vk_blitter_blit(struct wined3d_blitter *blitter, enum wined3d_blit_
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(src_texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(src_texture_vk->bind_mask), VK_ACCESS_TRANSFER_READ_BIT, src_texture_vk->layout, src_layout, src_texture_vk->image.vk_image, &vk_src_range); wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(dst_texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, dst_texture_vk->layout, dst_layout, dst_texture_vk->image.vk_image, &vk_dst_range);
diff --git a/dlls/wined3d/view.c b/dlls/wined3d/view.c index 3a5efe5368b..4cf0d43901f 100644 --- a/dlls/wined3d/view.c +++ b/dlls/wined3d/view.c @@ -1399,13 +1399,13 @@ void wined3d_shader_resource_view_vk_generate_mipmap(struct wined3d_shader_resou
wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_READ_BIT, texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, texture_vk->image.vk_image, &vk_src_range); wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, texture_vk->image.vk_image, &vk_dst_range); @@ -1467,7 +1467,7 @@ void wined3d_shader_resource_view_vk_generate_mipmap(struct wined3d_shader_resou texture_vk->image.vk_image, &vk_src_range); wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), + vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, texture_vk->image.vk_image, &vk_dst_range);
From: Stefan Dösinger stefan@codeweavers.com
Before submitting this in a non-draft MR I'll split out the LAYOUT_GENERAL reuse into a separate patch.
--- dlls/wined3d/resource.c | 4 ++++ dlls/wined3d/texture.c | 40 +++++++++++++++++++++++++--------- dlls/wined3d/utils.c | 1 + dlls/wined3d/wined3d_private.h | 2 ++ 4 files changed, 37 insertions(+), 10 deletions(-)
diff --git a/dlls/wined3d/resource.c b/dlls/wined3d/resource.c index e2100627198..48a69c9978a 100644 --- a/dlls/wined3d/resource.c +++ b/dlls/wined3d/resource.c @@ -567,6 +567,8 @@ VkAccessFlags vk_access_mask_from_bind_flags(uint32_t bind_flags) flags |= VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT; if (bind_flags & WINED3D_BIND_STREAM_OUTPUT) flags |= VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT; + if (bind_flags & WINED3D_BIND_TRANSFER_DST) + flags |= VK_ACCESS_TRANSFER_WRITE_BIT;
return flags; } @@ -589,6 +591,8 @@ VkPipelineStageFlags vk_pipeline_stage_mask_from_bind_flags(uint32_t bind_flags) flags |= VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT; if (bind_flags & WINED3D_BIND_STREAM_OUTPUT) flags |= VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT; + if (bind_flags & WINED3D_BIND_TRANSFER_DST) + flags |= VK_PIPELINE_STAGE_TRANSFER_BIT;
return flags; } diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index 54dd24e0f5c..41b883d4343 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5326,6 +5326,7 @@ static bool wined3d_texture_vk_clear(struct wined3d_texture_vk *texture_vk, VkImageSubresourceRange vk_range; VkClearColorValue colour_value; VkImageAspectFlags aspect_mask; + VkImageLayout layout; VkImage vk_image;
if (texture_vk->t.resource.format_attrs & WINED3D_FORMAT_ATTR_COMPRESSED) @@ -5362,29 +5363,45 @@ static bool wined3d_texture_vk_clear(struct wined3d_texture_vk *texture_vk,
wined3d_context_vk_end_current_render_pass(context_vk);
- wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, - texture_vk->layout, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, vk_image, &vk_range); + if (texture_vk->t.level_count != 1 || texture_vk->t.layer_count != 1) + { + wined3d_texture_vk_barrier(texture_vk, context_vk, WINED3D_BIND_TRANSFER_DST); + layout = texture_vk->layout; + } + else + { + if (texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + layout = VK_IMAGE_LAYOUT_GENERAL; + else + layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL; + + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, + vk_access_mask_from_bind_flags(texture_vk->bind_mask), VK_ACCESS_TRANSFER_WRITE_BIT, + texture_vk->layout, layout, vk_image, &vk_range); + }
if (format->depth_size || format->stencil_size) { depth_value.depth = sub_resource->clear_value.depth; depth_value.stencil = sub_resource->clear_value.stencil; VK_CALL(vkCmdClearDepthStencilImage(vk_command_buffer, vk_image, - VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, &depth_value, 1, &vk_range)); + layout, &depth_value, 1, &vk_range)); } else { wined3d_format_colour_to_vk(format, &sub_resource->clear_value.colour, &colour_value); VK_CALL(vkCmdClearColorImage(vk_command_buffer, vk_image, - VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, &colour_value, 1, &vk_range)); + layout, &colour_value, 1, &vk_range)); }
- wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, - VK_ACCESS_TRANSFER_WRITE_BIT, vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), - VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, texture_vk->layout, vk_image, &vk_range); + if (texture_vk->t.level_count != 1 || texture_vk->t.layer_count != 1) + { + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + VK_ACCESS_TRANSFER_WRITE_BIT, vk_access_mask_from_bind_flags(texture_vk->t.resource.bind_flags), + layout, texture_vk->layout, vk_image, &vk_range); + } wined3d_context_vk_reference_texture(context_vk, texture_vk);
wined3d_texture_validate_location(&texture_vk->t, sub_resource_idx, WINED3D_LOCATION_TEXTURE_RGB); @@ -5726,6 +5743,9 @@ enum VkImageLayout wined3d_layout_from_bind_mask(const struct wined3d_texture_vk case WINED3D_BIND_SHADER_RESOURCE: return VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
+ case WINED3D_BIND_TRANSFER_DST: + return VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL; + default: ERR("Unexpected bind mask %s.\n", wined3d_debug_bind_flags(bind_mask)); return VK_IMAGE_LAYOUT_GENERAL; diff --git a/dlls/wined3d/utils.c b/dlls/wined3d/utils.c index aff2b00e08d..357adda5e2e 100644 --- a/dlls/wined3d/utils.c +++ b/dlls/wined3d/utils.c @@ -4896,6 +4896,7 @@ const char *wined3d_debug_bind_flags(uint32_t bind_flags) BIND_FLAG_TO_STR(WINED3D_BIND_DEPTH_STENCIL); BIND_FLAG_TO_STR(WINED3D_BIND_UNORDERED_ACCESS); BIND_FLAG_TO_STR(WINED3D_BIND_INDIRECT_BUFFER); + BIND_FLAG_TO_STR(WINED3D_BIND_TRANSFER_DST); #undef BIND_FLAG_TO_STR if (bind_flags) FIXME("Unrecognised bind flag(s) %#x.\n", bind_flags); diff --git a/dlls/wined3d/wined3d_private.h b/dlls/wined3d/wined3d_private.h index bdf871a5f1d..7e0ea489ea6 100644 --- a/dlls/wined3d/wined3d_private.h +++ b/dlls/wined3d/wined3d_private.h @@ -338,6 +338,8 @@ struct min_lookup extern const struct min_lookup minMipLookup[WINED3D_TEXF_LINEAR + 1] DECLSPEC_HIDDEN; extern const GLenum magLookup[WINED3D_TEXF_LINEAR + 1] DECLSPEC_HIDDEN;
+#define WINED3D_BIND_TRANSFER_DST 0x10000000 + static const uint32_t WINED3D_READ_ONLY_BIND_MASK = WINED3D_BIND_VERTEX_BUFFER | WINED3D_BIND_INDEX_BUFFER | WINED3D_BIND_CONSTANT_BUFFER | WINED3D_BIND_SHADER_RESOURCE | WINED3D_BIND_INDIRECT_BUFFER;
From: Stefan Dösinger stefan@codeweavers.com
This patch, together with "wined3d: Avoid barriers between the same write type", increases performance in Rocket League by about 3%. This is not a rigorous benchmark, but the patch does have an impact.
--- dlls/wined3d/resource.c | 4 +- dlls/wined3d/texture.c | 83 +++++++++++++++++++++++----------- dlls/wined3d/utils.c | 1 + dlls/wined3d/wined3d_private.h | 4 +- 4 files changed, 64 insertions(+), 28 deletions(-)
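A rough sketch of where the saving comes from, under simplifying assumptions: the `sketch_*` names are hypothetical, only the layout is tracked, and hazards from earlier writes (which the real `wined3d_texture_vk_barrier()` also has to consider) are ignored. A texture with a single sub-resource is left in the transfer layout after the blit, while a texture with several sub-resources is transitioned for the blit and then transitioned back.

```c
#include <assert.h>

/* Hypothetical stand-ins for the VkImageLayout values involved. */
enum sketch_layout
{
    SKETCH_LAYOUT_GENERAL,
    SKETCH_LAYOUT_COLOR_ATTACHMENT,
    SKETCH_LAYOUT_TRANSFER_SRC,
};

struct sketch_texture
{
    unsigned int level_count, layer_count;
    enum sketch_layout layout;      /* tracked layout of the whole image */
    unsigned int barrier_count;     /* barriers recorded so far */
};

/* Use the texture as a blit source once and count the barriers this costs. */
static void sketch_blit_from(struct sketch_texture *texture)
{
    enum sketch_layout src_layout;

    assert(texture->level_count && texture->layer_count);

    /* A texture in the general layout is blitted from as-is. */
    if (texture->layout == SKETCH_LAYOUT_GENERAL)
        src_layout = SKETCH_LAYOUT_GENERAL;
    else
        src_layout = SKETCH_LAYOUT_TRANSFER_SRC;

    if (texture->level_count == 1 && texture->layer_count == 1)
    {
        /* Single sub-resource: keep the image in the transfer layout and
         * only record a barrier when the layout actually changes. */
        if (texture->layout != src_layout)
        {
            ++texture->barrier_count;
            texture->layout = src_layout;
        }
        return;
    }

    /* Several sub-resources: one barrier into the transfer layout before
     * the blit and one back to the tracked layout afterwards. */
    texture->barrier_count += 2;
}
```

In this model, two consecutive blits from the same single-level texture cost one barrier in total, whereas a mipmapped texture pays two barriers per blit.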
diff --git a/dlls/wined3d/resource.c b/dlls/wined3d/resource.c index 48a69c9978a..e7b84e9a035 100644 --- a/dlls/wined3d/resource.c +++ b/dlls/wined3d/resource.c @@ -569,6 +569,8 @@ VkAccessFlags vk_access_mask_from_bind_flags(uint32_t bind_flags) flags |= VK_ACCESS_TRANSFORM_FEEDBACK_WRITE_BIT_EXT; if (bind_flags & WINED3D_BIND_TRANSFER_DST) flags |= VK_ACCESS_TRANSFER_WRITE_BIT; + if (bind_flags & WINED3D_BIND_TRANSFER_SRC) + flags |= VK_ACCESS_TRANSFER_READ_BIT;
return flags; } @@ -591,7 +593,7 @@ VkPipelineStageFlags vk_pipeline_stage_mask_from_bind_flags(uint32_t bind_flags) flags |= VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT; if (bind_flags & WINED3D_BIND_STREAM_OUTPUT) flags |= VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT; - if (bind_flags & WINED3D_BIND_TRANSFER_DST) + if (bind_flags & (WINED3D_BIND_TRANSFER_DST | WINED3D_BIND_TRANSFER_SRC)) flags |= VK_PIPELINE_STAGE_TRANSFER_BIT;
return flags; diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index 41b883d4343..d246126eab5 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5746,6 +5746,9 @@ enum VkImageLayout wined3d_layout_from_bind_mask(const struct wined3d_texture_vk case WINED3D_BIND_TRANSFER_DST: return VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
+ case WINED3D_BIND_TRANSFER_SRC: + return VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL; + default: ERR("Unexpected bind mask %s.\n", wined3d_debug_bind_flags(bind_mask)); return VK_IMAGE_LAYOUT_GENERAL; @@ -7195,26 +7198,43 @@ static DWORD vk_blitter_blit(struct wined3d_blitter *blitter, enum wined3d_blit_ goto next; }
- if (src_texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) - src_layout = VK_IMAGE_LAYOUT_GENERAL; + if (src_texture->layer_count == 1 && src_texture->level_count == 1) + { + wined3d_texture_vk_barrier(src_texture_vk, context_vk, WINED3D_BIND_TRANSFER_SRC); + src_layout = src_texture_vk->layout; + } else - src_layout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL; + { + if (src_texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + src_layout = VK_IMAGE_LAYOUT_GENERAL; + else + src_layout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
- if (dst_texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) - dst_layout = VK_IMAGE_LAYOUT_GENERAL; + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, + vk_access_mask_from_bind_flags(src_texture_vk->bind_mask), + VK_ACCESS_TRANSFER_READ_BIT, src_texture_vk->layout, src_layout, + src_texture_vk->image.vk_image, &vk_src_range); + } + + if (dst_texture->layer_count == 1 && dst_texture->level_count == 1) + { + wined3d_texture_vk_barrier(dst_texture_vk, context_vk, WINED3D_BIND_TRANSFER_DST); + dst_layout = dst_texture_vk->layout; + } else - dst_layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL; + { + if (dst_texture_vk->layout == VK_IMAGE_LAYOUT_GENERAL) + dst_layout = VK_IMAGE_LAYOUT_GENERAL; + else + dst_layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
- wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(src_texture_vk->bind_mask), - VK_ACCESS_TRANSFER_READ_BIT, src_texture_vk->layout, src_layout, - src_texture_vk->image.vk_image, &vk_src_range); - wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, - vk_access_mask_from_bind_flags(dst_texture_vk->bind_mask), - VK_ACCESS_TRANSFER_WRITE_BIT, dst_texture_vk->layout, dst_layout, - dst_texture_vk->image.vk_image, &vk_dst_range); + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, + vk_access_mask_from_bind_flags(dst_texture_vk->bind_mask), + VK_ACCESS_TRANSFER_WRITE_BIT, dst_texture_vk->layout, dst_layout, + dst_texture_vk->image.vk_image, &vk_dst_range); + }
if (resolve) { @@ -7440,16 +7460,27 @@ static DWORD vk_blitter_blit(struct wined3d_blitter *blitter, enum wined3d_blit_ dst_texture_vk->image.vk_image, dst_layout, 1, &region)); }
- wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, - VK_ACCESS_TRANSFER_WRITE_BIT, - vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags), - dst_layout, dst_texture_vk->layout, dst_texture_vk->image.vk_image, &vk_dst_range); - wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, - VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, - VK_ACCESS_TRANSFER_READ_BIT, - vk_access_mask_from_bind_flags(src_texture_vk->t.resource.bind_flags), - src_layout, src_texture_vk->layout, src_texture_vk->image.vk_image, &vk_src_range); + if (dst_texture->layer_count != 1 || dst_texture->level_count != 1) + { + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + VK_ACCESS_TRANSFER_WRITE_BIT, + vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags), + dst_layout, dst_texture_vk->layout, dst_texture_vk->image.vk_image, &vk_dst_range); + } + else + dst_texture_vk->layout = dst_layout; + + if (src_texture->layer_count != 1 || src_texture->level_count != 1) + { + wined3d_context_vk_image_barrier(context_vk, vk_command_buffer, + VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + VK_ACCESS_TRANSFER_READ_BIT, + vk_access_mask_from_bind_flags(src_texture_vk->t.resource.bind_flags), + src_layout, src_texture_vk->layout, src_texture_vk->image.vk_image, &vk_src_range); + } + else + src_texture_vk->layout = src_layout;
wined3d_texture_validate_location(dst_texture, dst_sub_resource_idx, WINED3D_LOCATION_TEXTURE_RGB); wined3d_texture_invalidate_location(dst_texture, dst_sub_resource_idx, ~WINED3D_LOCATION_TEXTURE_RGB); diff --git a/dlls/wined3d/utils.c b/dlls/wined3d/utils.c index 357adda5e2e..46f266ca4d2 100644 --- a/dlls/wined3d/utils.c +++ b/dlls/wined3d/utils.c @@ -4897,6 +4897,7 @@ const char *wined3d_debug_bind_flags(uint32_t bind_flags) BIND_FLAG_TO_STR(WINED3D_BIND_UNORDERED_ACCESS); BIND_FLAG_TO_STR(WINED3D_BIND_INDIRECT_BUFFER); BIND_FLAG_TO_STR(WINED3D_BIND_TRANSFER_DST); + BIND_FLAG_TO_STR(WINED3D_BIND_TRANSFER_SRC); #undef BIND_FLAG_TO_STR if (bind_flags) FIXME("Unrecognised bind flag(s) %#x.\n", bind_flags); diff --git a/dlls/wined3d/wined3d_private.h b/dlls/wined3d/wined3d_private.h index 7e0ea489ea6..d2fb3c1f03c 100644 --- a/dlls/wined3d/wined3d_private.h +++ b/dlls/wined3d/wined3d_private.h @@ -339,9 +339,11 @@ extern const struct min_lookup minMipLookup[WINED3D_TEXF_LINEAR + 1] DECLSPEC_HI extern const GLenum magLookup[WINED3D_TEXF_LINEAR + 1] DECLSPEC_HIDDEN;
#define WINED3D_BIND_TRANSFER_DST 0x10000000 +#define WINED3D_BIND_TRANSFER_SRC 0x20000000
static const uint32_t WINED3D_READ_ONLY_BIND_MASK = WINED3D_BIND_VERTEX_BUFFER | WINED3D_BIND_INDEX_BUFFER - | WINED3D_BIND_CONSTANT_BUFFER | WINED3D_BIND_SHADER_RESOURCE | WINED3D_BIND_INDIRECT_BUFFER; + | WINED3D_BIND_CONSTANT_BUFFER | WINED3D_BIND_SHADER_RESOURCE | WINED3D_BIND_INDIRECT_BUFFER + | WINED3D_BIND_TRANSFER_SRC;
static const VkAccessFlags WINED3D_READ_ONLY_ACCESS_FLAGS = VK_ACCESS_INDIRECT_COMMAND_READ_BIT | VK_ACCESS_INDEX_READ_BIT | VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_ACCESS_UNIFORM_READ_BIT
From: Stefan Dösinger stefan@codeweavers.com
--- dlls/wined3d/texture.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index d246126eab5..4dbfb04c101 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5091,12 +5091,19 @@ static void wined3d_texture_vk_upload_data(struct wined3d_context *context, VK_CALL(vkCmdCopyBufferToImage(vk_command_buffer, src_bo->vk_buffer, dst_texture_vk->image.vk_image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region));
-    wined3d_context_vk_image_barrier(context_vk, vk_command_buffer,
-            VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
-            VK_ACCESS_TRANSFER_WRITE_BIT,
-            vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags),
-            VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, dst_texture_vk->layout,
-            dst_texture_vk->image.vk_image, &vk_range);
+    dst_texture_vk->bind_mask = WINED3D_BIND_TRANSFER_DST;
+    if (dst_texture->layer_count * dst_texture->level_count > 1)
+    {
+        wined3d_context_vk_image_barrier(context_vk, vk_command_buffer,
+                VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
+                VK_ACCESS_TRANSFER_WRITE_BIT,
+                vk_access_mask_from_bind_flags(dst_texture_vk->t.resource.bind_flags),
+                VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, dst_texture_vk->layout,
+                dst_texture_vk->image.vk_image, &vk_range);
+    }
+    else
+        dst_texture_vk->layout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
+
     wined3d_context_vk_reference_texture(context_vk, dst_texture_vk);
     wined3d_context_vk_reference_bo(context_vk, src_bo);
From: Stefan Dösinger stefan@codeweavers.com
FIXME: I am not sure this is correct yet... --- dlls/wined3d/texture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/dlls/wined3d/texture.c b/dlls/wined3d/texture.c index 4dbfb04c101..195b9aca5d5 100644 --- a/dlls/wined3d/texture.c +++ b/dlls/wined3d/texture.c @@ -5780,7 +5780,7 @@ void wined3d_texture_vk_barrier(struct wined3d_texture_vk *texture_vk, { src_bind_mask = texture_vk->bind_mask & WINED3D_READ_ONLY_BIND_MASK; if (!src_bind_mask) - src_bind_mask = texture_vk->bind_mask; + src_bind_mask = texture_vk->bind_mask & ~bind_mask;
texture_vk->bind_mask = bind_mask; }
Jan Sikorski (@jsikorski) commented about dlls/wined3d/texture.c:
     {
         src_bind_mask = texture_vk->bind_mask & WINED3D_READ_ONLY_BIND_MASK;
         if (!src_bind_mask)
-            src_bind_mask = texture_vk->bind_mask;
+            src_bind_mask = texture_vk->bind_mask & ~bind_mask;
That doesn't seem right. Without a barrier, the first write might overwrite the second if both writes use different caches and those are flushed out of execution order. Is there something in the spec that says it couldn't happen?
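To make the concern concrete, here is a minimal standalone sketch of just the mask computation touched by this patch, before and after the change. It is not the wined3d code: the function names are made up for illustration, the read-only mask is reduced to two flags, and the `BIND_SHADER_RESOURCE` / `BIND_RENDER_TARGET` values are illustrative (only the two `TRANSFER` values appear in this series). What the caller does with an empty source mask is not shown in the hunk, so the sketch only computes the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; only the two TRANSFER values are taken from this series. */
#define BIND_SHADER_RESOURCE 0x00000008u
#define BIND_RENDER_TARGET   0x00000020u
#define BIND_TRANSFER_DST    0x10000000u
#define BIND_TRANSFER_SRC    0x20000000u

/* Reduced stand-in for WINED3D_READ_ONLY_BIND_MASK. */
#define READ_ONLY_BIND_MASK (BIND_SHADER_RESOURCE | BIND_TRANSFER_SRC)

/* Source mask as computed before patch 3: fall back to the full current mask. */
static uint32_t src_mask_before(uint32_t current, uint32_t next)
{
    uint32_t src = current & READ_ONLY_BIND_MASK;

    assert(next != 0); /* a transition always targets some usage */
    if (!src)
        src = current;
    return src;
}

/* Source mask as computed with patch 3: drop usages shared with the new mask. */
static uint32_t src_mask_after(uint32_t current, uint32_t next)
{
    uint32_t src = current & READ_ONLY_BIND_MASK;

    assert(next != 0);
    if (!src)
        src = current & ~next;
    return src;
}
```

With these definitions, going from `BIND_TRANSFER_DST` to `BIND_TRANSFER_DST` gives a source mask of `BIND_TRANSFER_DST` before the patch and an empty mask after it, which is exactly the write-after-write case being questioned. A transition from `BIND_RENDER_TARGET` to `BIND_SHADER_RESOURCE` is unaffected.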
A couple observations/questions from a partial review:
The first patch looks like a proper bug fix, maybe we want it in independently of the rest of this MR?
Regarding your question in the second patch, the answer seems to be yes, `wined3d_rendertarget_view_vk_get_image_view` calls `wined3d_texture_vk_get_default_image_info`.
In patch 5 (and others), could we extend `wined3d_texture_vk_barrier` to handle barriers before the blits, and not worry about barriers after? Getting rid of the special case with mips/layers? (I'm not exactly sure how to nicely handle this, but currently it looks a little unwieldy. Besides, maybe we could more easily cut more `wined3d_context_vk_image_barrier` sandwiches with a generalized version.)
On Wed Mar 22 15:41:35 2023 +0000, Jan Sikorski wrote:
That doesn't seem right. Without a barrier, the first write might overwrite the second if both writes use different caches and those are flushed out of execution order. Is there something in the spec that says it couldn't happen?
Hmm, do we need a barrier after every draw, even if nothing else is done in between? Do we need a barrier after every vkCmdCopyImage? I always had the assumption that this is not the case - barriers are needed if you switch between different operations or pipelines, but not when doing more of the same.
I'll re-read the Vulkan docs with this question in mind...
In patch 5 (and others), could we extend wined3d_texture_vk_barrier to handle barriers before the blits, and not worry about barriers after?
I guess yes, but it brings its own problems. This question is why I'd split off the blit/clear into a separate series.
I am concerned about the cost (and code required) of iterating over every subresource in wined3d_texture_vk_barrier - which is something that would be necessary if we want to make it support blits in general. The common case of the entire image being in one layout could certainly be optimized though.
We can't always transition the entire resource. A blit may happen between two subresources of the same texture.
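A rough sketch of how the "whole image in one layout" fast path could sit next to per-subresource tracking is below. Everything in it is hypothetical: `struct layout_tracker`, the `tracker_*` functions and the `enum layout` stand-in for `VkImageLayout` do not exist in wined3d, and the functions only count the barriers a caller would have to record instead of issuing any Vulkan calls.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SUBRESOURCES 64

/* Hypothetical stand-in for VkImageLayout, to keep the sketch self-contained. */
enum layout { LAYOUT_GENERAL, LAYOUT_SHADER_READ, LAYOUT_TRANSFER_SRC, LAYOUT_TRANSFER_DST };

struct layout_tracker
{
    unsigned int count;                    /* layers * levels */
    bool uniform;                          /* all subresources share uniform_layout */
    enum layout uniform_layout;
    enum layout per_sub[MAX_SUBRESOURCES]; /* only valid while !uniform */
};

void tracker_init(struct layout_tracker *t, unsigned int count, enum layout layout)
{
    assert(count >= 1 && count <= MAX_SUBRESOURCES);
    t->count = count;
    t->uniform = true;
    t->uniform_layout = layout;
}

/* Transition a single subresource, e.g. for a blit within one texture.
 * Returns the number of barriers the caller would record (0 or 1). */
unsigned int tracker_transition_one(struct layout_tracker *t, unsigned int idx, enum layout layout)
{
    unsigned int i;

    assert(idx < t->count);
    if (t->uniform)
    {
        if (t->uniform_layout == layout)
            return 0;
        if (t->count == 1)
        {
            /* Single subresource: stay on the fast path. */
            t->uniform_layout = layout;
            return 1;
        }
        /* Leave the fast path and start tracking each subresource. */
        for (i = 0; i < t->count; ++i)
            t->per_sub[i] = t->uniform_layout;
        t->uniform = false;
    }
    if (t->per_sub[idx] == layout)
        return 0;
    t->per_sub[idx] = layout;
    return 1;
}

/* Transition the whole image. The uniform case needs at most one barrier;
 * otherwise this naive version counts one per out-of-date subresource. */
unsigned int tracker_transition_all(struct layout_tracker *t, enum layout layout)
{
    unsigned int i, barriers = 0;

    if (t->uniform)
    {
        if (t->uniform_layout == layout)
            return 0;
        t->uniform_layout = layout;
        return 1;
    }
    for (i = 0; i < t->count; ++i)
    {
        if (t->per_sub[i] != layout)
            ++barriers;
    }
    t->uniform = true;
    t->uniform_layout = layout;
    return barriers;
}
```

The iteration cost is only paid after a partial transition has split the image. A draw or whole-resource blit on an image that is still uniform costs one comparison, and a real implementation could merge adjacent subresources with the same old layout into one barrier range.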
On Wed Mar 22 17:28:23 2023 +0000, Stefan Dösinger wrote:
Hmm, do we need a barrier after every draw, even if nothing else is done in between? Do we need a barrier after every vkCmdCopyImage? I always had the assumption that this is not the case - barriers are needed if you switch between different operations or pipelines, but not when doing more of the same. I'll re-read the Vulkan docs with this question in mind...
The validation layer agrees with you (VK_LAYER_ENABLES=VK_VALIDATION_FEATURE_ENABLE_SYNCHRONIZATION_VALIDATION_EXT):
SYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): msgNum: 1544472022 - Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0x4306440000001c3c, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x5c0ec5d6 | vkCmdCopyImage: Hazard WRITE_AFTER_WRITE for dstImage VkImage 0x4306440000001c3c[], region 0. Access info (usage: SYNC_COPY_TRANSFER_WRITE, prior_usage: SYNC_COPY_TRANSFER_WRITE, write_barriers: 0, command: vkCmdCopyImage, seq_no: 449, reset_no: 1).
Now investigating the background behind the two consecutive copy calls I think there is some odd game behavior:
```
08e4:trace:d3d11:d3d11_device_context_CopySubresourceRegion iface 00000000007E01D8, dst_resource 00007FF9AFE68B20, dst_subresource_idx 0, dst_x 0, dst_y 0, dst_z 0, src_resource 00007FF9AFE68B90, src_subresource_idx 0, src_box 0000000000000000.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B20, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B90, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_device_context_CopySubresourceRegion iface 00000000007E01D8, dst_resource 00007FF9AFE68B20, dst_subresource_idx 0, dst_x 0, dst_y 0, dst_z 0, src_resource 00007FF9AFE68B90, src_subresource_idx 0, src_box 0000000000000000.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B20, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B90, resource_dimension 0000000023E3F84C.
SYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): ...
```
It is doing the same copy twice, without any other d3d11 calls in between. The source texture might be mapped and modified in the meantime, but that would be illegal afaiu. So I am probably trying to optimize around a game bug here.
On Mon Mar 27 10:46:53 2023 +0000, Stefan Dösinger wrote:
The validation layer agrees with you (VK_LAYER_ENABLES=VK_VALIDATION_FEATURE_ENABLE_SYNCHRONIZATION_VALIDATION_EXT): SYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): msgNum: 1544472022 - Validation Error: [ SYNC-HAZARD-WRITE-AFTER-WRITE ] Object 0: handle = 0x4306440000001c3c, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x5c0ec5d6 | vkCmdCopyImage: Hazard WRITE_AFTER_WRITE for dstImage VkImage 0x4306440000001c3c[], region 0. Access info (usage: SYNC_COPY_TRANSFER_WRITE, prior_usage: SYNC_COPY_TRANSFER_WRITE, write_barriers: 0, command: vkCmdCopyImage, seq_no: 449, reset_no: 1). Now investigating the background behind the two consecutive copy calls I think there is some odd game behavior:
08e4:trace:d3d11:d3d11_device_context_CopySubresourceRegion iface 00000000007E01D8, dst_resource 00007FF9AFE68B20, dst_subresource_idx 0, dst_x 0, dst_y 0, dst_z 0, src_resource 00007FF9AFE68B90, src_subresource_idx 0, src_box 0000000000000000.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B20, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B90, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_device_context_CopySubresourceRegion iface 00000000007E01D8, dst_resource 00007FF9AFE68B20, dst_subresource_idx 0, dst_x 0, dst_y 0, dst_z 0, src_resource 00007FF9AFE68B90, src_subresource_idx 0, src_box 0000000000000000.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B20, resource_dimension 0000000023E3F84C.
08e4:trace:d3d11:d3d11_texture2d_GetType iface 00007FF9AFE68B90, resource_dimension 0000000023E3F84C.
SYNC-HAZARD-WRITE-AFTER-WRITE(ERROR / SPEC): ...
It is doing the same copy twice, without any other d3d11 calls in between. The source texture might be mapped and modified in the meantime, but that would be illegal afaiu. So I am probably trying to optimize around a game bug here.
Oh, since it isn't 100% visible in the d3d11 part of this log: Both CopySubresourceRegion calls write to the same area - the src box dimensions match the texture dimensions.
Regarding your question in the second patch, the answer seems to be yes, wined3d_rendertarget_view_vk_get_image_view calls wined3d_texture_vk_get_default_image_info.
While it does so, it only grabs VkDescriptorImageInfo::imageView and ignores the layout. The other callers are either UAV-related (wined3d_unordered_access_view_vk_clear, wined3d_shader_descriptor_writes_vk_add_uav_write), for which texture_vk->layout is supposed to be LAYOUT_GENERAL anyway, or wined3d_shader_descriptor_writes_vk_add_srv_write, which needs either GENERAL or SHADER_READ_ONLY_OPTIMAL.
Still, this is not nice. This needs a better solution.