This greatly increases performance in "Discovery Tour by Assassin's Creed: Ancient Egypt".
The application frequently performs small (< 1 kiB) uploads to a large (128 MiB) buffer. Without this patch, we will always create and destroy a new Vulkan memory allocation, and the Vulkan driver will waste time zeroing the entire allocation.
From: Zebediah Figura zfigura@codeweavers.com
This greatly increases performance in "Discovery Tour by Assassin's Creed: Ancient Egypt".
The application frequently performs small (< 1 kiB) uploads to a large (128 MiB) buffer. Without this patch, we will always create and destroy a new Vulkan memory allocation, and the Vulkan driver will waste time zeroing the entire allocation. --- dlls/wined3d/adapter_vk.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/dlls/wined3d/adapter_vk.c b/dlls/wined3d/adapter_vk.c index 58671cfb5f9..bf93e4a1c19 100644 --- a/dlls/wined3d/adapter_vk.c +++ b/dlls/wined3d/adapter_vk.c @@ -1165,9 +1165,12 @@ void adapter_vk_copy_bo_address(struct wined3d_context *context, return; }
+ for (i = 0; i < range_count; ++i) + size = max(size, ranges[i].offset + ranges[i].size); + if (src_bo && !(src_bo->memory_type & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)) { - if (!(wined3d_context_vk_create_bo(context_vk, src_bo->size, VK_BUFFER_USAGE_TRANSFER_DST_BIT, + if (!(wined3d_context_vk_create_bo(context_vk, size, VK_BUFFER_USAGE_TRANSFER_DST_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, &staging_bo))) { ERR("Failed to create staging bo.\n"); @@ -1187,7 +1190,7 @@ void adapter_vk_copy_bo_address(struct wined3d_context *context, if (dst_bo && (!(dst_bo->memory_type & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) || (!(map_flags & WINED3D_MAP_DISCARD) && dst_bo->command_buffer_id > context_vk->completed_command_buffer_id))) { - if (!(wined3d_context_vk_create_bo(context_vk, dst_bo->size, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, + if (!(wined3d_context_vk_create_bo(context_vk, size, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, &staging_bo))) { ERR("Failed to create staging bo.\n"); @@ -1204,9 +1207,6 @@ void adapter_vk_copy_bo_address(struct wined3d_context *context, return; }
- for (i = 0; i < range_count; ++i) - size = max(size, ranges[i].offset + ranges[i].size); - src_ptr = adapter_vk_map_bo_address(context, src, size, WINED3D_MAP_READ); dst_ptr = adapter_vk_map_bo_address(context, dst, size, map_flags);
Maybe it makes sense to go one step further and only allocate a max(size - offset) buffer? If I understand correctly, as it stands, we'd still allocate 128MiB if the upload is between two large buffers and ranges only hit a small part at the end? That said, I guess it's not a thing that happens commonly, so I'm not blocking this patch.
On 9/13/22 05:16, Jan Sikorski (@jsikorski) wrote:
Maybe it makes sense to go one step further and only allocate a max(size - offset) buffer? If I understand correctly, as it stands, we'd still allocate 128MiB if the upload is between two large buffers and ranges only hit a small part at the end? That said, I guess it's not a thing that happens commonly, so I'm not blocking this patch.
Yes, there's definitely room for further improvement. It wasn't immediately clear what the best way was, though, and this was enough for now...
On 9/13/22 05:16, Jan Sikorski (@jsikorski) wrote:
Maybe it makes sense to go one step further and only allocate a max(size - offset) buffer? If I understand correctly, as it stands, we'd still allocate 128MiB if the upload is between two large buffers and ranges only hit a small part at the end? That said, I guess it's not a thing that happens commonly, so I'm not blocking this patch.
Yes, there's definitely room for further improvement. It wasn't immediately clear what the best way was, though, and this was enough for now...
This merge request was approved by Jan Sikorski.