[PATCH 1/3] wined3d: Use wined3d_buffer_load_location() in wined3d_buffer_get_memory().
Signed-off-by: Zebediah Figura <zfigura(a)codeweavers.com> --- dlls/wined3d/buffer.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/dlls/wined3d/buffer.c b/dlls/wined3d/buffer.c index b42c7bd919a..3d66af3e5e6 100644 --- a/dlls/wined3d/buffer.c +++ b/dlls/wined3d/buffer.c @@ -653,14 +653,12 @@ DWORD wined3d_buffer_get_memory(struct wined3d_buffer *buffer, struct wined3d_co if (locations & WINED3D_LOCATION_DISCARDED) { locations = ((buffer->flags & WINED3D_BUFFER_USE_BO) ? WINED3D_LOCATION_BUFFER : WINED3D_LOCATION_SYSMEM); - if (!wined3d_buffer_prepare_location(buffer, context, locations)) + if (!wined3d_buffer_load_location(buffer, context, locations)) { data->buffer_object = 0; data->addr = NULL; return 0; } - wined3d_buffer_validate_location(buffer, locations); - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_DISCARDED); } if (locations & WINED3D_LOCATION_BUFFER) { -- 2.34.1
Instead of checking whether a BO already exists. This will end up allocating a BO earlier in some cases. This is not particularly impactful by itself, since we already would have sysmem available and thus could use it without a performance penalty. However, we would like to avoid ever allocating sysmem where not necessary, in particular by deferring allocation of any location at all until the resource is written to. This also has the side effect of fixing test_map_synchronization() on 64-bit architectures, broken since 194b47b4fd92dda8ebf24e88ca7a14fc926c84ab. The test creates a buffer, maps it once, then maps it again with NOOVERWRITE while the GPU is still drawing, expecting the new data to be read by the GPU during the draw. On 32-bit machines, and 64-bit machines before the offending commit, we do the following: First map: uses SYSMEM since the BO is not created yet Draw: upload to VBO Second map: map the existing VBO with GL_MAP_UNSYNCHRONIZED_BIT After 194b47b4fd9, we don't use GL_MAP_UNSYNCHRONIZED_BIT since the buffer has READ access, which means that the second map will be synchronized and wait for the draw to complete. After this patch, we do the following: First map: create and map a VBO (not unsynchronized, but coherent and persistently mapped) Draw: use mapped VBO Second map: write to existing (coherent) VBO, which is unsynchronized Signed-off-by: Zebediah Figura <zfigura(a)codeweavers.com> --- This diff is distinctly more readable with --ignore-all-space. dlls/wined3d/buffer.c | 139 ++++++++++++++++++++++-------------------- 1 file changed, 72 insertions(+), 67 deletions(-) diff --git a/dlls/wined3d/buffer.c b/dlls/wined3d/buffer.c index 3d66af3e5e6..abbef0fac41 100644 --- a/dlls/wined3d/buffer.c +++ b/dlls/wined3d/buffer.c @@ -911,105 +911,110 @@ static HRESULT buffer_resource_sub_resource_map(struct wined3d_resource *resourc void **map_ptr, const struct wined3d_box *box, uint32_t flags) { struct wined3d_buffer *buffer = buffer_from_resource(resource); + unsigned int offset, size, dirty_offset, dirty_size; struct wined3d_device *device = resource->device; struct wined3d_context *context; - unsigned int offset, size; + struct wined3d_bo_address addr; uint8_t *base; LONG count; TRACE("resource %p, sub_resource_idx %u, map_ptr %p, box %s, flags %#x.\n", resource, sub_resource_idx, map_ptr, debug_box(box), flags); - offset = box->left; - size = box->right - box->left; + dirty_offset = offset = box->left; + dirty_size = size = box->right - box->left; count = ++resource->map_count; - if (buffer->buffer_object) + /* DISCARD invalidates the entire buffer, regardless of the specified + * offset and size. Some applications also depend on the entire buffer + * being uploaded in that case. Two such applications are Port Royale + * and Darkstar One. */ + if (flags & WINED3D_MAP_DISCARD) { - unsigned int dirty_offset = offset, dirty_size = size; - struct wined3d_bo_address addr; + dirty_offset = 0; + dirty_size = 0; + } - /* DISCARD invalidates the entire buffer, regardless of the specified - * offset and size. Some applications also depend on the entire buffer - * being uploaded in that case. Two such applications are Port Royale - * and Darkstar One. */ - if (flags & WINED3D_MAP_DISCARD) + if (((flags & WINED3D_MAP_WRITE) && !(flags & (WINED3D_MAP_NOOVERWRITE | WINED3D_MAP_DISCARD))) + || (!(flags & WINED3D_MAP_WRITE) && (buffer->locations & WINED3D_LOCATION_SYSMEM)) + || buffer->flags & WINED3D_BUFFER_PIN_SYSMEM + || !(buffer->flags & WINED3D_BUFFER_USE_BO)) + { + if (!(buffer->locations & WINED3D_LOCATION_SYSMEM)) { - dirty_offset = 0; - dirty_size = 0; + context = context_acquire(device, NULL, 0); + wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_SYSMEM); + context_release(context); } - if (((flags & WINED3D_MAP_WRITE) && !(flags & (WINED3D_MAP_NOOVERWRITE | WINED3D_MAP_DISCARD))) - || (!(flags & WINED3D_MAP_WRITE) && (buffer->locations & WINED3D_LOCATION_SYSMEM)) - || buffer->flags & WINED3D_BUFFER_PIN_SYSMEM) + if (flags & WINED3D_MAP_WRITE) + wined3d_buffer_invalidate_range(buffer, WINED3D_LOCATION_BUFFER, dirty_offset, dirty_size); + } + else + { + context = context_acquire(device, NULL, 0); + + if (flags & WINED3D_MAP_DISCARD) { - if (!(buffer->locations & WINED3D_LOCATION_SYSMEM)) + if (!wined3d_buffer_prepare_location(buffer, context, WINED3D_LOCATION_BUFFER)) { - context = context_acquire(device, NULL, 0); - wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_SYSMEM); context_release(context); + return E_OUTOFMEMORY; } - - if (flags & WINED3D_MAP_WRITE) - wined3d_buffer_invalidate_range(buffer, WINED3D_LOCATION_BUFFER, dirty_offset, dirty_size); + wined3d_buffer_validate_location(buffer, WINED3D_LOCATION_BUFFER); } else { - context = context_acquire(device, NULL, 0); + wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_BUFFER); + } - if (flags & WINED3D_MAP_DISCARD) - wined3d_buffer_validate_location(buffer, WINED3D_LOCATION_BUFFER); - else - wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_BUFFER); + if (flags & WINED3D_MAP_WRITE) + { + wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_SYSMEM); + buffer_invalidate_bo_range(buffer, dirty_offset, dirty_size); + } - if (flags & WINED3D_MAP_WRITE) + if ((flags & WINED3D_MAP_DISCARD) && resource->heap_memory) + wined3d_buffer_evict_sysmem(buffer); + + if (count == 1) + { + addr.buffer_object = buffer->buffer_object; + addr.addr = 0; + buffer->map_ptr = wined3d_context_map_bo_address(context, &addr, resource->size, flags); + /* We are accessing buffer->resource.client from the CS thread, + * but it's safe because the client thread will wait for the + * map to return, thus completely serializing this call with + * other client code. */ + buffer->resource.client.addr = addr; + + if (((DWORD_PTR)buffer->map_ptr) & (RESOURCE_ALIGNMENT - 1)) { - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_SYSMEM); - buffer_invalidate_bo_range(buffer, dirty_offset, dirty_size); - } + WARN("Pointer %p is not %u byte aligned.\n", buffer->map_ptr, RESOURCE_ALIGNMENT); - if ((flags & WINED3D_MAP_DISCARD) && resource->heap_memory) - wined3d_buffer_evict_sysmem(buffer); + wined3d_context_unmap_bo_address(context, &addr, 0, NULL); + buffer->map_ptr = NULL; - if (count == 1) - { - addr.buffer_object = buffer->buffer_object; - addr.addr = 0; - buffer->map_ptr = wined3d_context_map_bo_address(context, &addr, resource->size, flags); - /* We are accessing buffer->resource.client from the CS thread, - * but it's safe because the client thread will wait for the - * map to return, thus completely serializing this call with - * other client code. */ - buffer->resource.client.addr = addr; - - if (((DWORD_PTR)buffer->map_ptr) & (RESOURCE_ALIGNMENT - 1)) + if (resource->usage & WINED3DUSAGE_DYNAMIC) + { + /* The extra copy is more expensive than not using VBOs + * at all on the NVIDIA Linux driver, which is the + * only driver that returns unaligned pointers. */ + TRACE("Dynamic buffer, dropping VBO.\n"); + wined3d_buffer_drop_bo(buffer); + } + else { - WARN("Pointer %p is not %u byte aligned.\n", buffer->map_ptr, RESOURCE_ALIGNMENT); - - wined3d_context_unmap_bo_address(context, &addr, 0, NULL); - buffer->map_ptr = NULL; - - if (resource->usage & WINED3DUSAGE_DYNAMIC) - { - /* The extra copy is more expensive than not using VBOs - * at all on the NVIDIA Linux driver, which is the - * only driver that returns unaligned pointers. */ - TRACE("Dynamic buffer, dropping VBO.\n"); - wined3d_buffer_drop_bo(buffer); - } - else - { - TRACE("Falling back to doublebuffered operation.\n"); - wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_SYSMEM); - buffer->flags |= WINED3D_BUFFER_PIN_SYSMEM; - } - TRACE("New pointer is %p.\n", resource->heap_memory); + TRACE("Falling back to doublebuffered operation.\n"); + wined3d_buffer_load_location(buffer, context, WINED3D_LOCATION_SYSMEM); + buffer->flags |= WINED3D_BUFFER_PIN_SYSMEM; } + TRACE("New pointer is %p.\n", resource->heap_memory); } - - context_release(context); } + + context_release(context); } base = buffer->map_ptr ? buffer->map_ptr : resource->heap_memory; -- 2.34.1
Signed-off-by: Henri Verbeet <hverbeet(a)codeweavers.com>
The practical effect of this is to defer clearing buffers until they are used. Under normal circumstances the buffer will be initially discarded, in which case we need not clear it at all, and may even avoid ever allocating sysmem. Signed-off-by: Zebediah Figura <zfigura(a)codeweavers.com> --- dlls/wined3d/buffer.c | 31 +++++++++++++++++++++++-------- dlls/wined3d/utils.c | 1 + dlls/wined3d/wined3d_private.h | 1 + 3 files changed, 25 insertions(+), 8 deletions(-) diff --git a/dlls/wined3d/buffer.c b/dlls/wined3d/buffer.c index abbef0fac41..2f3b5fd316e 100644 --- a/dlls/wined3d/buffer.c +++ b/dlls/wined3d/buffer.c @@ -607,13 +607,28 @@ BOOL wined3d_buffer_load_location(struct wined3d_buffer *buffer, switch (location) { case WINED3D_LOCATION_SYSMEM: - range.offset = 0; - range.size = buffer->resource.size; - buffer->buffer_ops->buffer_download_ranges(buffer, context, - buffer->resource.heap_memory, 0, 1, &range); + if (buffer->locations & WINED3D_LOCATION_CLEARED) + { + memset(buffer->resource.heap_memory, 0, buffer->resource.size); + } + else + { + range.offset = 0; + range.size = buffer->resource.size; + buffer->buffer_ops->buffer_download_ranges(buffer, context, + buffer->resource.heap_memory, 0, 1, &range); + } break; case WINED3D_LOCATION_BUFFER: + if (buffer->locations & WINED3D_LOCATION_CLEARED) + { + /* FIXME: Clear the buffer on the GPU if possible. */ + if (!wined3d_buffer_prepare_location(buffer, context, WINED3D_LOCATION_SYSMEM)) + return FALSE; + memset(buffer->resource.heap_memory, 0, buffer->resource.size); + } + if (!buffer->conversion_map) buffer->buffer_ops->buffer_upload_ranges(buffer, context, buffer->resource.heap_memory, 0, buffer->modified_areas, buffer->maps); @@ -650,7 +665,7 @@ DWORD wined3d_buffer_get_memory(struct wined3d_buffer *buffer, struct wined3d_co TRACE("buffer %p, context %p, data %p, locations %s.\n", buffer, context, data, wined3d_debug_location(locations)); - if (locations & WINED3D_LOCATION_DISCARDED) + if (locations & (WINED3D_LOCATION_DISCARDED | WINED3D_LOCATION_CLEARED)) { locations = ((buffer->flags & WINED3D_BUFFER_USE_BO) ? WINED3D_LOCATION_BUFFER : WINED3D_LOCATION_SYSMEM); if (!wined3d_buffer_load_location(buffer, context, locations)) @@ -949,7 +964,7 @@ static HRESULT buffer_resource_sub_resource_map(struct wined3d_resource *resourc } if (flags & WINED3D_MAP_WRITE) - wined3d_buffer_invalidate_range(buffer, WINED3D_LOCATION_BUFFER, dirty_offset, dirty_size); + wined3d_buffer_invalidate_range(buffer, ~WINED3D_LOCATION_SYSMEM, dirty_offset, dirty_size); } else { @@ -971,7 +986,7 @@ static HRESULT buffer_resource_sub_resource_map(struct wined3d_resource *resourc if (flags & WINED3D_MAP_WRITE) { - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_SYSMEM); + wined3d_buffer_invalidate_location(buffer, ~WINED3D_LOCATION_BUFFER); buffer_invalidate_bo_range(buffer, dirty_offset, dirty_size); } @@ -1241,7 +1256,7 @@ static HRESULT wined3d_buffer_init(struct wined3d_buffer *buffer, struct wined3d } buffer->buffer_ops = buffer_ops; buffer->structure_byte_stride = desc->structure_byte_stride; - buffer->locations = data ? WINED3D_LOCATION_DISCARDED : WINED3D_LOCATION_SYSMEM; + buffer->locations = WINED3D_LOCATION_CLEARED; TRACE("buffer %p, size %#x, usage %#x, memory @ %p.\n", buffer, buffer->resource.size, buffer->resource.usage, buffer->resource.heap_memory); diff --git a/dlls/wined3d/utils.c b/dlls/wined3d/utils.c index e50d42ead92..bf9ae11ea13 100644 --- a/dlls/wined3d/utils.c +++ b/dlls/wined3d/utils.c @@ -6864,6 +6864,7 @@ const char *wined3d_debug_location(DWORD location) #define LOCATION_TO_STR(x) if (location & x) { debug_append(&buffer, #x, " | "); location &= ~x; } LOCATION_TO_STR(WINED3D_LOCATION_DISCARDED); LOCATION_TO_STR(WINED3D_LOCATION_SYSMEM); + LOCATION_TO_STR(WINED3D_LOCATION_CLEARED); LOCATION_TO_STR(WINED3D_LOCATION_BUFFER); LOCATION_TO_STR(WINED3D_LOCATION_TEXTURE_RGB); LOCATION_TO_STR(WINED3D_LOCATION_TEXTURE_SRGB); diff --git a/dlls/wined3d/wined3d_private.h b/dlls/wined3d/wined3d_private.h index e28fc3b880d..1b3a13024f9 100644 --- a/dlls/wined3d/wined3d_private.h +++ b/dlls/wined3d/wined3d_private.h @@ -4316,6 +4316,7 @@ void wined3d_resource_update_draw_binding(struct wined3d_resource *resource) DEC #define WINED3D_LOCATION_DISCARDED 0x00000001 #define WINED3D_LOCATION_SYSMEM 0x00000002 +#define WINED3D_LOCATION_CLEARED 0x00000004 #define WINED3D_LOCATION_BUFFER 0x00000008 #define WINED3D_LOCATION_TEXTURE_RGB 0x00000010 #define WINED3D_LOCATION_TEXTURE_SRGB 0x00000020 -- 2.34.1
Signed-off-by: Henri Verbeet <hverbeet(a)codeweavers.com>
On Fri, 28 Jan 2022 at 03:23, Zebediah Figura <zfigura(a)codeweavers.com> wrote:
@@ -653,14 +653,12 @@ DWORD wined3d_buffer_get_memory(struct wined3d_buffer *buffer, struct wined3d_co if (locations & WINED3D_LOCATION_DISCARDED) { locations = ((buffer->flags & WINED3D_BUFFER_USE_BO) ? WINED3D_LOCATION_BUFFER : WINED3D_LOCATION_SYSMEM); - if (!wined3d_buffer_prepare_location(buffer, context, locations)) + if (!wined3d_buffer_load_location(buffer, context, locations)) { data->buffer_object = 0; data->addr = NULL; return 0; } - wined3d_buffer_validate_location(buffer, locations); - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_DISCARDED); } Where would we load the contents from if the buffer has been discarded?
On Fri, 28 Jan 2022 at 12:08, Henri Verbeet <hverbeet(a)gmail.com> wrote:
On Fri, 28 Jan 2022 at 03:23, Zebediah Figura <zfigura(a)codeweavers.com> wrote:
@@ -653,14 +653,12 @@ DWORD wined3d_buffer_get_memory(struct wined3d_buffer *buffer, struct wined3d_co if (locations & WINED3D_LOCATION_DISCARDED) { locations = ((buffer->flags & WINED3D_BUFFER_USE_BO) ? WINED3D_LOCATION_BUFFER : WINED3D_LOCATION_SYSMEM); - if (!wined3d_buffer_prepare_location(buffer, context, locations)) + if (!wined3d_buffer_load_location(buffer, context, locations)) { data->buffer_object = 0; data->addr = NULL; return 0; } - wined3d_buffer_validate_location(buffer, locations); - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_DISCARDED); } Where would we load the contents from if the buffer has been discarded?
Or well, I suppose the point of the patch is that wined3d_buffer_load_location() will essentially do the same thing as the code it replaces in this case; I guess that makes sense.
On 1/28/22 05:12, Henri Verbeet wrote:
On Fri, 28 Jan 2022 at 12:08, Henri Verbeet <hverbeet(a)gmail.com> wrote:
On Fri, 28 Jan 2022 at 03:23, Zebediah Figura <zfigura(a)codeweavers.com> wrote:
@@ -653,14 +653,12 @@ DWORD wined3d_buffer_get_memory(struct wined3d_buffer *buffer, struct wined3d_co if (locations & WINED3D_LOCATION_DISCARDED) { locations = ((buffer->flags & WINED3D_BUFFER_USE_BO) ? WINED3D_LOCATION_BUFFER : WINED3D_LOCATION_SYSMEM); - if (!wined3d_buffer_prepare_location(buffer, context, locations)) + if (!wined3d_buffer_load_location(buffer, context, locations)) { data->buffer_object = 0; data->addr = NULL; return 0; } - wined3d_buffer_validate_location(buffer, locations); - wined3d_buffer_invalidate_location(buffer, WINED3D_LOCATION_DISCARDED); } Where would we load the contents from if the buffer has been discarded?
Or well, I suppose the point of the patch is that wined3d_buffer_load_location() will essentially do the same thing as the code it replaces in this case; I guess that makes sense.
Yes. I suppose it looks odd by itself, but I started writing the case for WINED3D_LOCATION_CLEARED here and then realized that I was open-coding wined3d_buffer_load_location().
participants (4)
-
Henri Verbeet -
Henri Verbeet -
Zebediah Figura -
Zebediah Figura (she/her)