we also don't want to transfer from CPU to GPU when wg_transform didn't actually return any sample (which is the case in at least half of the invocations, whenever it needs more input data). That is, if the sample is not going to be returned we don't want to lock the dxgi buffer at all, as there is no way to tell it not to perform a texture update. But we still need to pass some buffer, as we don't know upfront whether the data will be used or not;
I see... then it probably makes sense if there's no way to discard the D3D locking.
I'm not a huge fan of the amount of code required to create and copy the temporary buffer, so it would probably be better to factor this out into a common part. Or maybe this should just use an internal `MFCreateSampleCopierMFT` to do the dirty job. This could also be an opportunity to improve the sample copier for video buffers, if Lock2D is more efficient.
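To make the "common part" idea concrete, here is a rough, portable sketch of what such a helper could look like. None of these names exist in the tree: `struct gpu_buffer`, `process_output_fn`, `process_output_via_temp` and the two callbacks are hypothetical stand-ins, and the `lock_count` increment only models the point where the real code would lock the dxgi buffer. It just illustrates the pattern discussed above: decode into a temporary buffer, and only touch the GPU-backed buffer when a sample was actually produced.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dxgi-backed buffer; lock_count records how
 * often it was locked (the real lock is what triggers the texture update). */
struct gpu_buffer
{
    unsigned char *data;
    size_t size;
    unsigned int lock_count;
};

/* Hypothetical decoder callback: writes at most size bytes into out and
 * returns the number of bytes produced, or 0 if it needs more input. */
typedef size_t (*process_output_fn)(void *ctx, unsigned char *out, size_t size);

/* Returns 1 if a sample was copied into dst, 0 if nothing was produced,
 * -1 if the temporary buffer could not be allocated. */
static int process_output_via_temp(struct gpu_buffer *dst, process_output_fn process, void *ctx)
{
    unsigned char *tmp;
    size_t produced;

    if (!dst->size || !(tmp = malloc(dst->size))) return -1;

    produced = process(ctx, tmp, dst->size);
    if (produced > dst->size) produced = dst->size;

    if (produced)
    {
        /* Only now lock the destination, so no upload happens otherwise. */
        dst->lock_count++;
        memcpy(dst->data, tmp, produced);
    }

    free(tmp);
    return produced ? 1 : 0;
}

/* Example callbacks, for illustration only. */
static size_t need_more_input(void *ctx, unsigned char *out, size_t size)
{
    (void)ctx; (void)out; (void)size;
    return 0;
}

static size_t fill_frame(void *ctx, unsigned char *out, size_t size)
{
    (void)ctx;
    memset(out, 0xab, size);
    return size;
}
```

With `need_more_input` the destination is never locked; with `fill_frame` it is locked exactly once and receives the data. Whether this is better done as a shared helper or through the sample copier is still the open question.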