On Wed Apr 5 19:10:43 2023 +0000, Rémi Bernon wrote:
We also don't want to transfer from CPU to GPU if wg_transform didn't really return any samples (which is the case in at least half of the invocations, when it needs more input data). That is, if the sample is not going to be returned we don't want to lock the DXGI buffer at all, as there is no way to tell it not to perform any texture update. But we still need to pass some buffer, as we don't know upfront whether the data will be used or not.

I see... then it probably makes sense if there's no way to discard the D3D locking. I'm not a huge fan of the amount of code required to create and copy the temporary buffer, so having this factored out into a common part would probably be better. Or maybe this should just use an internal `MFCreateSampleCopierMFT` to do the dirty job. This could also be an opportunity to improve the sample copier for video buffers if Lock2D is more efficient.
The sample copier does look related. I guess the relevant code is mfplat/sample.c:sample_CopyToBuffer(), which could be optimized for a 2D destination buffer: it could lock the buffer with Lock2DSize to pass the flags, which avoids both pulling the data from the GPU for the destination buffer and creating a temporary linear buffer. But that is probably for another MR? Using the temporary buffer and the sample copier here will remove the extra copies when there are no output samples, while leaving the GPU -> CPU copies in place.
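
To make the temporary buffer idea concrete, here is a minimal sketch of the pattern, not the actual MR code. Everything in it is a hypothetical stand-in: `mock_gpu_buffer` plays the role of the DXGI-backed buffer, `mock_transform` plays the role of wg_transform, and the sketch assumes that every unlock of the destination costs one texture update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DXGI-backed buffer, not a real mfplat type. */
struct mock_gpu_buffer
{
    unsigned char data[64];
    bool locked;
    unsigned int upload_count; /* CPU -> GPU updates triggered by unlocking */
};

static unsigned char *mock_gpu_buffer_lock(struct mock_gpu_buffer *buffer)
{
    assert(!buffer->locked);
    buffer->locked = true;
    return buffer->data;
}

static void mock_gpu_buffer_unlock(struct mock_gpu_buffer *buffer)
{
    assert(buffer->locked);
    buffer->locked = false;
    buffer->upload_count++; /* assumption: each unlock costs one texture update */
}

/* Hypothetical transform: fills out and returns true only when a frame is ready. */
static bool mock_transform(bool have_output, unsigned char *out, size_t size)
{
    if (!have_output) return false; /* needs more input data */
    memset(out, 0xab, size);
    return true;
}

/* Decode into a temporary CPU buffer, and lock the destination only
 * when the transform actually produced a sample. */
static bool process_output(struct mock_gpu_buffer *dst, bool have_output)
{
    unsigned char tmp[sizeof(dst->data)];
    unsigned char *ptr;

    if (!mock_transform(have_output, tmp, sizeof(tmp)))
        return false; /* destination never locked, so no texture update */

    ptr = mock_gpu_buffer_lock(dst);
    memcpy(ptr, tmp, sizeof(tmp));
    mock_gpu_buffer_unlock(dst);
    return true;
}
```

Calling `process_output()` with `have_output` set to false leaves `upload_count` untouched, which is the saving discussed above; the price is the extra `memcpy` from the temporary buffer whenever a sample is produced.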