https://bugs.winehq.org/show_bug.cgi?id=45901
--- Comment #4 from Andrew Wesie awesie@gmail.com ---
It looks like Overwatch has a 1D render target texture that it copies to a 1D staging texture. It rotates through 5 destination textures, likely to avoid stalling while waiting on the GPU. In the excerpts below, I include only one destination texture.
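To illustrate the rotation described above, here is a minimal CPU-only sketch in C. It is not taken from Overwatch or Wine: the names (staging_ring, ring_issue_copy, ring_read_oldest) are hypothetical, a float stands in for each 1D staging texture, and the assumption that the application reads back the oldest slot is my guess at why a rotation of 5 would avoid a stall.

```c
#include <assert.h>
#include <stddef.h>

/* The comment reports a rotation of 5 destination textures. */
#define RING_SIZE 5

/* Hypothetical model: each float stands in for one 1D staging texture. */
struct staging_ring
{
    unsigned int frame;       /* number of copies issued so far */
    float slots[RING_SIZE];
};

/* Index of the slot that receives the next copy. */
static size_t ring_write_slot(const struct staging_ring *ring)
{
    return ring->frame % RING_SIZE;
}

/* Model of "CopyResource into the current staging texture". */
static void ring_issue_copy(struct staging_ring *ring, float value)
{
    ring->slots[ring_write_slot(ring)] = value;
    ++ring->frame;
}

/* Read the oldest slot, i.e. the one that will be overwritten next.
 * Its copy was issued RING_SIZE copies ago, so on real hardware it has
 * most likely completed and mapping it should not have to wait.
 * Returns 0 while the ring has not been filled once yet. */
static int ring_read_oldest(const struct staging_ring *ring, float *out)
{
    assert(out != NULL);
    if (ring->frame < RING_SIZE)
        return 0;
    *out = ring->slots[ring->frame % RING_SIZE];
    return 1;
}
```

With this model, the value read back always lags the most recent copy by RING_SIZE - 1 copies, which is the latency an application would presumably accept in exchange for not waiting on the GPU.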
Source texture init:
d3d11_device_CreateTexture1D iface 0xc1dc0, desc 0xc46e560, data (nil), texture 0x7f4fbfe07ce8
wined3d_texture_init texture 0x61ef5400, resource_type WINED3D_RTYPE_TEXTURE_1D, format WINED3DFMT_R32_FLOAT, multisample_type 0, multisample_quality 0, usage WINED3DUSAGE_RENDERTARGET | WINED3DUSAGE_TEXTURE, access WINED3D_RESOURCE_ACCESS_GPU, width 1024, height 1, depth 1, layer_count 1, level_count 1, flags 0, ...
Destination texture init:
d3d11_device_CreateTexture1D iface 0xc1dc0, desc 0xc46f150, data (nil), texture 0xaeb050
wined3d_texture_init texture 0x16f01ab0, resource_type WINED3D_RTYPE_TEXTURE_1D, format WINED3DFMT_R32_FLOAT, multisample_type 0, multisample_quality 0, usage 0, access WINED3D_RESOURCE_ACCESS_CPU | WINED3D_RESOURCE_ACCESS_MAP_R, width 1024, height 1, depth 1, layer_count 1, level_count 1, flags 0, ...
Copy:
d3d11_immediate_context_CopyResource iface 0xc1df0, dst_resource 0x16f01a40, src_resource 0x61ef6950
wined3d_device_copy_resource device 0xd9740, dst_resource 0x16f01ab0, src_resource 0x61ef5400
wined3d_cs_run Executing WINED3D_CS_OP_BLT_SUB_RESOURCE
texture2d_blt dst_texture 0x16f01ab0, dst_sub_resource_idx 0, dst_box (0, 0, 0)-(1024, 1, 1), src_texture 0x61ef5400, src_sub_resource_idx 0, src_box (0, 0, 0)-(1024, 1, 1), flags 0x20000000, fx 0xc6d0494, filter WINED3D_TEXF_POINT
The reason my patch uses glReadPixels (e.g. texture2d_read_from_framebuffer) is AMD + Mesa. Based on the Mesa source code and empirical testing, GetTexImage (implemented with GetTexSubImage) does not have any special support on AMD, unlike i965:
./mesa/drivers/dri/i965/intel_tex_image.c: functions->GetTexSubImage = intel_get_tex_sub_image;
whereas glReadPixels has support on AMD that avoids the GPU synchronization:
./mesa/drivers/dri/i965/intel_pixel.c: functions->ReadPixels = intelReadPixels;
./mesa/drivers/dri/i915/intel_pixel.c: functions->ReadPixels = intelReadPixels;
./mesa/drivers/dri/r200/r200_state.c: functions->ReadPixels = radeonReadPixels;
./mesa/drivers/dri/radeon/radeon_state.c: ctx->Driver.ReadPixels = radeonReadPixels;
The default implementation of GetTexSubImage will resort to mapping the PBO and doing a copy on the CPU side. The radeon implementation of ReadPixels does the expected thing of asking the GPU to copy the texture into the PBO.