Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=46068
The change introduced by commit 54a9e84952ce8f566f79ab92930faf84bad0cfc1 had side effect of ultimately selecting detsination texture location as dst_texture->resource.map_binding, which is not the case in the default texture2d_blt() code path.
Signed-off-by: Paul Gofman gofmanp@gmail.com --- dlls/wined3d/surface.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/dlls/wined3d/surface.c b/dlls/wined3d/surface.c index 5f5c53a8b8..18542e1b74 100644 --- a/dlls/wined3d/surface.c +++ b/dlls/wined3d/surface.c @@ -3275,7 +3275,8 @@ HRESULT texture2d_blt(struct wined3d_texture *dst_texture, unsigned int dst_sub_ } } else if (!(src_sub_resource->locations & surface_simple_locations) - && (dst_sub_resource->locations & dst_texture->resource.map_binding)) + && (dst_sub_resource->locations & dst_texture->resource.map_binding) + && !(dst_texture->resource.access & WINED3D_RESOURCE_ACCESS_GPU)) { /* Download */ if (scale)
On Tue, 20 Nov 2018 at 15:51, Paul Gofman gofmanp@gmail.com wrote:
The change introduced by commit 54a9e84952ce8f566f79ab92930faf84bad0cfc1 had side effect of ultimately selecting detsination texture location as dst_texture->resource.map_binding, which is not the case in the default texture2d_blt() code path.
I think this change probably makes sense, but I'm curious about the details. Is the destination texture a mappable rendertarget or offscreen-plain surface? It would need to be mappable for this to be an issue, but system memory surfaces wouldn't have WINED3D_RESOURCE_ACCESS_GPU. Which API is the application calling? StretchRect()?
On 11/20/18 16:59, Henri Verbeet wrote:
On Tue, 20 Nov 2018 at 15:51, Paul Gofman gofmanp@gmail.com wrote: I think this change probably makes sense, but I'm curious about the details. Is the destination texture a mappable rendertarget or offscreen-plain surface? It would need to be mappable for this to be an issue, but system memory surfaces wouldn't have WINED3D_RESOURCE_ACCESS_GPU.
It is render target:
-- snip --
009f:trace:d3d9:d3d9_device_CreateTexture iface 0x166f48, width 1920, height 1065, levels 1, usage 0x1, format 0x15, pool 0, texture 0xa66e1c0, shared_handle (nil). 009f:trace:d3d:wined3d_texture_create device 0x17d1d8, desc 0xa66e084, layer_count 1, level_count 1, flags 0x8, data (nil), parent 0x5f8e00b0, parent_ops 0x7ed0cd40, texture 0x5f8e00c0. 009f:trace:d3d:wined3d_texture_gl_init texture_gl 0x5f8e00e8, device 0x17d1d8, desc 0xa66e084, layer_count 1, level_count 1, flags 0x8, parent 0x5f8e00b0, parent_ops 0x7ed0cd40, sub_resources 0x5f8e02c0. 009f:trace:d3d:wined3d_texture_init texture 0x5f8e00e8, resource_type WINED3D_RTYPE_TEXTURE_2D, format WINED3DFMT_B8G8R8A8_UNORM, multisample_type 0, multisample_quality 0, usage 0, access WINED3D_RESOURCE_ACCESS_GPU | WINED3D_RESOURCE_ACCESS_MAP_R | WINED3D_RESOURCE_ACCESS_MAP_W, width 1920, height 1065, depth 1, layer_count 1, level_count 1, flags 0x8, device 0x17d1d8, parent 0x5f8e00b0, parent_ops 0x7ed0cd40, sub_resources 0x5f8e02c0, texture_ops 0x7ecc5fb8. --- snip ---
It gets WINED3D_LOCATION_SYSMEM right at initialiaztion and this location is never used again anywhere but in the fast texture load code path under consideration (which ultimately prefers the sysmem location). Application never tries to map it.
Which API is the application calling? StretchRect()?
Yes, it is StretchRect().
On Tue, 20 Nov 2018 at 17:39, Paul Gofman gofmanp@gmail.com wrote:
It gets WINED3D_LOCATION_SYSMEM right at initialiaztion and this location is never used again anywhere but in the fast texture load code path under consideration (which ultimately prefers the sysmem location). Application never tries to map it.
Ah yes, that would do it. I think ideally we'd fix that as well. There's a somewhat theoretical concern that for e.g. a mappable rendertarget or default pool offscreen plain surface it may be advantageous to use the download path as well. I'm not sure how much we care about those cases.
On 11/20/18 17:38, Henri Verbeet wrote:
On Tue, 20 Nov 2018 at 17:39, Paul Gofman gofmanp@gmail.com wrote:
It gets WINED3D_LOCATION_SYSMEM right at initialiaztion and this location is never used again anywhere but in the fast texture load code path under consideration (which ultimately prefers the sysmem location). Application never tries to map it.
Ah yes, that would do it. I think ideally we'd fix that as well. There's a somewhat theoretical concern that for e.g. a mappable rendertarget or default pool offscreen plain surface it may be advantageous to use the download path as well. I'm not sure how much we care about those cases.
I am not sure I fully understand. The case was that mappable render target was taking the download path, and that was causing performance regression due to downloading texture to system memory while it was not actually used there. Should I maybe use texture download_count to refine the condition of taking download path? The thing is though download_count is not currently incremented in texture2d_load_location() (only in texture1d_load_location() and texture3d_load_location(). This should probably be changed too in this case?
On Tue, 20 Nov 2018 at 18:28, Paul Gofman gofmanp@gmail.com wrote:
On 11/20/18 17:38, Henri Verbeet wrote:
Ah yes, that would do it. I think ideally we'd fix that as well. There's a somewhat theoretical concern that for e.g. a mappable rendertarget or default pool offscreen plain surface it may be advantageous to use the download path as well. I'm not sure how much we care about those cases.
I am not sure I fully understand. The case was that mappable render target was taking the download path, and that was causing performance regression due to downloading texture to system memory while it was not actually used there. Should I maybe use texture download_count to refine the condition of taking download path? The thing is though download_count is not currently incremented in texture2d_load_location() (only in texture1d_load_location() and texture3d_load_location(). This should probably be changed too in this case?
I think the patch is fine the way it is, I'm running the tests right now.
The case I was thinking of above is when the map binding location is current because the application is actually mapping the texture, rather than because it ends up being the same as the default location. We can't really distinguish those two cases right now.
On 11/20/18 18:08, Henri Verbeet wrote:
The case I was thinking of above is when the map binding location is current because the application is actually mapping the texture, rather than because it ends up being the same as the default location. We can't really distinguish those two cases right now.
I thought there is download_count texture variable meant for this purpose. It currently makes wined3d_texture_evict_sysmem() do nothing if the application effectively requested to download the texture to sysmem over 50 times. This counter could help to sort out default location case or to ignore just a few application requested mappings during some initialization. Still I suppose more accurate statistics might be ideally required for texture blit. E. g., what if application mapped and downloaded the texture 51 times over the run time for some specific purposes, but other 10000 times did StretchBlt without mapping anything? Maybe just count of total texture blits in addition to download count could help?
On Tue, 20 Nov 2018 at 18:57, Paul Gofman gofmanp@gmail.com wrote:
On 11/20/18 18:08, Henri Verbeet wrote:
The case I was thinking of above is when the map binding location is current because the application is actually mapping the texture, rather than because it ends up being the same as the default location. We can't really distinguish those two cases right now.
I thought there is download_count texture variable meant for this purpose. It currently makes wined3d_texture_evict_sysmem() do nothing if the application effectively requested to download the texture to sysmem over 50 times. This counter could help to sort out default location case or to ignore just a few application requested mappings during some initialization. Still I suppose more accurate statistics might be ideally required for texture blit. E. g., what if application mapped and downloaded the texture 51 times over the run time for some specific purposes, but other 10000 times did StretchBlt without mapping anything? Maybe just count of total texture blits in addition to download count could help?
Very generically, yes, more accurate statistics would allow us to make better decisions. There's a bigger picture as well though. Transferring resources between the GPU and CPU is really only supposed to be a consideration for D3DPOOL_MANAGED resources. At the same time, for d3d9, the general advice is that if you're doing anything remotely performance sensitive, it's best to avoid managed resources. D3DPOOL_SYSTEMMEM resources should stay on the CPU, D3DPOOL_DEFAULT resources should stay on the GPU. There are a few exceptions to that, mainly dynamic resources, lockable render targets, and offscreen plain surfaces. For lockable render targets, mapping them is expected to be slow; I'm not sure we tested it, but the documentation seems to suggest they're downloaded from the GPU each time they're locked. Dynamic resources are effectively read-only for the GPU. I.e., D3DUSAGE_DYNAMIC is mutually exclusive with D3DUSAGE_RENDERTARGET or D3DUSAGE_DEPTHSTENCIL. (Note also that StrectRect() doesn't allow blitting to regular textures.) While it may be possible for the CPU to read from dynamic resources, there should never be a reason to, and doing so is not expected to be fast. And then there's offscreen plain surfaces. Those can't really be used as texture or rendertarget either, but they can be used as StretchRect() destination when blitting from another offscreen plain surface.