From: Zebediah Figura zfigura@codeweavers.com
d3d maps of such resources will map the CPU copy, and uploads and downloads use glBufferSubData() and glGetBufferSubData() respectively. There is no need to map the BO.
This improves performance of Indivisible on NVidia GPUs. The game uses a d3d9 MANAGED buffer for streaming vertex data, which results in poor performance on NVidia when using GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT. With this change we use neither. --- dlls/wined3d/resource.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/dlls/wined3d/resource.c b/dlls/wined3d/resource.c index 08edce3630c..c32c9580ae9 100644 --- a/dlls/wined3d/resource.c +++ b/dlls/wined3d/resource.c @@ -379,10 +379,13 @@ GLbitfield wined3d_resource_gl_storage_flags(const struct wined3d_resource *reso
if (resource->usage & WINED3DUSAGE_DYNAMIC) flags |= GL_CLIENT_STORAGE_BIT; - if (access & WINED3D_RESOURCE_ACCESS_MAP_W) - flags |= GL_MAP_WRITE_BIT; - if (access & WINED3D_RESOURCE_ACCESS_MAP_R) - flags |= GL_MAP_READ_BIT; + if (!(access & WINED3D_RESOURCE_ACCESS_CPU)) + { + if (access & WINED3D_RESOURCE_ACCESS_MAP_W) + flags |= GL_MAP_WRITE_BIT; + if (access & WINED3D_RESOURCE_ACCESS_MAP_R) + flags |= GL_MAP_READ_BIT; + }
return flags; }
From: Zebediah Figura zfigura@codeweavers.com
d3d maps of such resources will map the CPU copy, and uploads and downloads can use a staging BO.
This is the Vulkan equivalent of the previous commit, but has not been tested with Indivisible, since the latter is a d3d9 game, and the Vulkan backend currently does not support d3d9. --- dlls/wined3d/buffer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/dlls/wined3d/buffer.c b/dlls/wined3d/buffer.c index 3a2263d109f..5158511b05f 100644 --- a/dlls/wined3d/buffer.c +++ b/dlls/wined3d/buffer.c @@ -1468,9 +1468,9 @@ VkMemoryPropertyFlags vk_memory_type_from_access_flags(uint32_t access, uint32_t { VkMemoryPropertyFlags memory_type = 0;
- if (access & WINED3D_RESOURCE_ACCESS_MAP_R) + if (!(access & WINED3D_RESOURCE_ACCESS_CPU) && (access & WINED3D_RESOURCE_ACCESS_MAP_R)) memory_type |= VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_CACHED_BIT; - else if (access & WINED3D_RESOURCE_ACCESS_MAP_W) + else if (!(access & WINED3D_RESOURCE_ACCESS_CPU) && (access & WINED3D_RESOURCE_ACCESS_MAP_W)) memory_type |= VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT; else if (!(usage & WINED3DUSAGE_DYNAMIC)) memory_type |= VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT;
d3d maps of such resources will map the CPU copy, and uploads and downloads can
use a staging BO.
I'm concerned about performance implications of this change. If we use a staging (temporary) BO for downloads, then we're going to stall waiting for the download to complete in `wined3d_texture_vk_download_data`. What's the story here for D3D11_USAGE_STAGING resources?
I understand there's some reason to do this for the GL backend, what motivation there is for the Vulkan change?
On 10/5/22 09:13, Jan Sikorski (@jsikorski) wrote:
d3d maps of such resources will map the CPU copy, and uploads and downloads can
use a staging BO.
I'm concerned about performance implications of this change. If we use a staging (temporary) BO for downloads, then we're going to stall waiting for the download to complete in `wined3d_texture_vk_download_data`. What's the story here for D3D11_USAGE_STAGING resources?
I understand there's some reason to do this for the GL backend, what motivation there is for the Vulkan change?
Sadly I did not think this through very completely :-/
I wrote the change for the GL backend, tacitly assumed that it should apply to Vulkan as well, checked that it worked functionally, and then submitted the patch.
IIRC I *did* discuss this with an NVidia driver developer, and the comment I got was that MAP_READ_BIT | MAP_WRITE_BIT would return a memory type corresponding to Vulkan's HOST_CACHED, so if the managed BO performs badly with MAP_READ_BIT | MAP_WRITE_BIT then I'd expect it to perform badly with the current Vulkan backend as well.
But there's no way of testing, and it's not clear that this *is* an improvement, so I think it makes the most sense just to leave out this patch for now. I'll resend with just 1/2.