On 5/31/21 6:43 PM, Henri Verbeet wrote:
On Mon, 31 May 2021 at 23:45, Zebediah Figura (she/her) zfigura@codeweavers.com wrote:
On 5/31/21 4:53 AM, Henri Verbeet wrote:
My (very rough) proposal for this would be to replace the "data" field in the wined3d_cs_update_sub_resource structure with a wined3d_bo_address, and then introduce something like the following:
static bool wined3d_device_context_get_upload_bo(struct
wined3d_device_context *context, struct wined3d_resource *resource, unsigned int sub_resource_idx, const struct wined3d_box *box, uint32_t flags, struct wined3d_bo_address *address, void **map_ptr) { ... }
This function strikes me as more than a little confusing, party because it seems to be returning two partially overlapping things, with mostly different usages...
i.e. I'd presume the idea is for wined3d_device_context_update_sub_resource() to retrieve this and then call something like wined3d_context_copy_bo_address() to copy from sysmem, which makes sense, but then what are we doing with map_ptr?
Well, wined3d_context_copy_bo_address() needs a wined3d_context, which you typically don't have on application threads.
The basic idea is that you'd pass "address" to WINED3D_CS_OP_UPDATE_SUB_RESOURCE, and "map_ptr " to the application. (In case of map/unmap; for update_sub_resource() you'd just copy the data into "map_ptr".)
Or, if we use this in wined3d_device_context_map(), couldn't we just call wined3d_context_map_bo_address() on the returned wined3d_bo_address?
Conceptually, sure, you could split this into allocate/map/unmap. In practice though, you'll always want the upload buffer to be mapped, in order to copy data into it from system memory. And in fact, if we're going to return a GPU buffer here, it should generally already be mapped. (For various reasons; for 64-bit applications, we use persistent maps if we can; for OpenGL, we want to avoid making GL calls from application threads; if we have a dedicated upload buffer belonging to the device context, it would have been mapped during creation.) When needed, unmap can happen when the upload buffer is consumed.
Thanks for explaining the rationale there, that does help a lot.
It still feels awkward to me to pass the same information maybe twice, but I guess the wined3d_bo_address part should basically be considered opaque above the CS thread.
For deferred contexts you should then be able to do something (conceptually) very similar to patch 2/2, and in the end we may not even need map/unmap/update_sub_resource in the wined3d_device_context_ops structure.
We'd still presumably need an unmap, though, unless the idea above is to *always* allocate new (GPU or CPU) memory, which seems like a much bigger step.
Not necessarily, but wined3d_cs_exec_update_sub_resource() can unmap if needed.
Unless I'm misreading things, this also seems to imply that *_resource_sub_resource_map() shouldn't exist anymore, right?
Non-discard maps still exist, and wined3d_device_context_get_upload_bo() wouldn't be used for those.
So the idea goes like this:
(1) Pass a whole wined3d_bo_address as part of struct wined3d_cs_update_sub_resource.
(2) Introduce wined3d_device_context_get_upload_bo() to allocate memory. At first, only use it for deferred contexts.
(3) Use wined3d_device_context_get_upload_bo() for immediate contexts as well, for both DISCARD and NOOVERWRITE maps and for update_sub_resource. This implies always allocating new CPU/GPU memory, going through the DEFAULT queue, and not waiting for the upload to complete.
The idea that map/unmap/update_sub_resource can go away seems to imply (3) to me, but I'm more than a little curious as to whether it's actually desirable in all cases. Even with persistent mapping you still need to allocate new memory, which isn't exactly cheap, especially with large resources.
And then there's the aforementioned non-discard maps—don't we still need an explicit WINED3D_CS_OP_UNMAP for those? Or any case (are there any?) where we don't want to persistently map a resource?