On Thu Jul 20 12:55:28 2023 +0000, Henri Verbeet wrote:
This function is the main reason for much lower performance vs Windows in HZD at least. Not loading device from dst_heap when we can have it in a register is an advantage. I've added a comment to clarify.

In that case, should we be calling d3d12_desc_copy() in a loop from device.c at all? Or would it be better to e.g. introduce a "d3d12_descriptor_heap_copy()" function that takes care of that loop and inlines d3d12_desc_copy() (and quite possibly d3d12_desc_write_atomic())? We may also want to consider placing "device" in the same cacheline as "use_vk_heaps" inside struct d3d12_descriptor_heap in that case.

Yes, I think it will benefit from another patch series to optimise copying. It may also be worth storing texture views in their respective resources to eliminate refcounting of texture views.
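For illustration only, the heap-level copy discussed above might look roughly like the sketch below. Every name with a "sketch_" prefix is hypothetical, and the struct layouts are simplified stand-ins rather than vkd3d's real definitions. The points it tries to show are that the loop lives in one function, the device pointer is loaded once before the loop, and the per-descriptor helper is small enough to inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT vkd3d's real definitions. */
struct sketch_device
{
    unsigned int copies_done; /* Placeholder for per-device state touched by a copy. */
};

struct sketch_desc
{
    unsigned int payload;
};

struct sketch_descriptor_heap
{
    /* Fields read on every copy are declared next to each other, so they
     * are likely (not guaranteed) to land in the same cache line. */
    bool use_vk_heaps;
    struct sketch_device *device;
    size_t desc_count;
    struct sketch_desc *descriptors;
};

/* Per-descriptor copy. The device is passed in by the caller instead of
 * being reloaded from the destination heap for each descriptor. */
static inline void sketch_desc_copy(struct sketch_desc *dst,
        const struct sketch_desc *src, struct sketch_device *device)
{
    *dst = *src;
    ++device->copies_done;
}

/* Range copy that owns the loop, so the caller no longer iterates itself. */
static void sketch_descriptor_heap_copy(struct sketch_descriptor_heap *dst_heap,
        size_t dst_offset, const struct sketch_descriptor_heap *src_heap,
        size_t src_offset, size_t count)
{
    struct sketch_device *device = dst_heap->device; /* Loaded once. */
    size_t i;

    assert(dst_offset <= dst_heap->desc_count
            && count <= dst_heap->desc_count - dst_offset);
    assert(src_offset <= src_heap->desc_count
            && count <= src_heap->desc_count - src_offset);

    for (i = 0; i < count; ++i)
        sketch_desc_copy(&dst_heap->descriptors[dst_offset + i],
                &src_heap->descriptors[src_offset + i], device);
}
```

Whether this helps in practice would need profiling; the sketch leaves out the atomic write path and the Vulkan heap handling entirely.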