---
I came across this in my attempts to implement nooverwrite maps based on sysmem staging buffers when buffer storage is not available. My debug modifications broke the fast path for update_sub_resource and the wined3d_resource_wait() didn't do its job for the push constant buffers.
From: Stefan Dösinger stefan@codeweavers.com
---
I came across this in my attempts to implement nooverwrite maps based on sysmem staging buffers when buffer storage is not available. My debug modifications broke the fast path for update_sub_resource and the wined3d_resource_wait() didn't do its job for the push constant buffers. --- dlls/wined3d/cs.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/dlls/wined3d/cs.c b/dlls/wined3d/cs.c index 3cfce72ae94..50d89615dd3 100644 --- a/dlls/wined3d/cs.c +++ b/dlls/wined3d/cs.c @@ -1056,6 +1056,11 @@ static void reference_graphics_pipeline_resources(struct wined3d_device_context if (state->fb.render_targets[i]) wined3d_device_context_reference_resource(context, state->fb.render_targets[i]->resource); } + for (i = 0; i < WINED3D_PUSH_CONSTANTS_COUNT; ++i) + { + if (context->device->push_constants[i]) + wined3d_device_context_reference_resource(context, &context->device->push_constants[i]->resource); + } if (state->fb.depth_stencil) wined3d_device_context_reference_resource(context, state->fb.depth_stencil->resource); reference_shader_resources(context, ~(1u << WINED3D_SHADER_TYPE_COMPUTE));
This merge request was approved by Jan Sikorski.
My understanding is that we would not normally get to wined3d_resource_wait on those buffers, do you expect that we might want to do that? Patch seems fine, but on the other hand the unneeded overhead on each draw, even if small, is maybe better avoided..
No, we shouldn't get there, ideally update_sub_resource should avoid syncing there. But for one, that's an implementation detail of USR, and it is also something that USR may change.
E.g. if the buffer is idle, USR may one day decide to copy the data directly into the destination buffer instead of a HeapAlloc'ed temp buffer. Avoiding two up-to-4k memcpys should make up for assigning values in a handful of buffers, but I haven't benchmarked this.
The reason why I ran into this is that I semi-accidentally eliminated the sync "acceleration" in map_upload_bo, thinking it was intended as a broken fallback for async maps without ARB_buffer_storage. That surprisingly broke rendering because the USR wait_resource codepath didn't work...
Semi-related to this patch, the current heap_alloc code in map_upload_bo has a few flaws and I think it should go away:
1. It HeapAlloc/HeapFree's on every invocation, and thus on every SetShaderConstant. That's worse than setting values on a few buffers. 2. It returns non-aligned buffers in case of maps - but that is easy to fix 3. It throws away existing data without the app setting DISCARD, which is a fatal flaw for maps
If we want to avoid buffer fencing in draws that should be more thorough than relying on implementation details. E.g. we should be able to get rid of most buffer fences for at least d3d10 in draws because we either draw from GPU-only resources (can't be mapped) or dynamic resources (app is responsible). Sync maps are only possible on staging resources, so we only have to fence things like CopyResource. If the d3d7-9 managed resource code is careful I think we can extend the same logic to earlier API versions. But that's beyond what I want to look into right now.
This merge request was approved by Zebediah Figura.