Hi,
On 11/02/22 09:17, Nikolay Sivov wrote:
> I don't think this works. It should be possible to LockRect() after the buffer was locked; if you keep the surface locked, this won't work. I don't think we have tests for that, but that's what quick testing on Windows shows.
Ouch. So Microsoft really does two more copies per Lock()? Bad!
Maybe what we could do anyway is to call the first LockRect() with D3DLOCK_READONLY and the second one with D3DLOCK_DISCARD, though I doubt it will save anything near 3 ms per frame.
It would be nice if the Lock() interface had something similar to DISCARD, but alas it doesn't.
> We could have a shortcut in MFCopyImage() first, to have a single copy call when strides match, instead of calling per row. The next step could be to have some SIMD variants, with non-temporal copy like the docs suggest. No idea how much this improves performance, but for large enough copies it's meant to bypass the cache at least, I think.
I tried the single copy thing, but I couldn't see any significant change, so I didn't even bother submitting. As for the non-temporal copy, I don't know much about it either, but something makes me feel it won't change that much either.
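For reference, the single copy shortcut I have in mind looks roughly like the sketch below. This is not the Wine implementation of MFCopyImage(): copy_image() and its parameter names are hypothetical, and I am assuming the fast path is taken only when both strides equal the row width in bytes, so that the whole image is one contiguous block with no padding to skip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not the actual MFCopyImage() code. */
static void copy_image(uint8_t *dst, ptrdiff_t dst_stride,
                       const uint8_t *src, ptrdiff_t src_stride,
                       size_t width_bytes, size_t lines)
{
    assert(dst && src);

    /* Fast path: rows are tightly packed in both buffers, so the image
     * is contiguous and one memcpy() covers all rows. */
    if (dst_stride == src_stride && src_stride == (ptrdiff_t)width_bytes)
    {
        memcpy(dst, src, width_bytes * lines);
        return;
    }

    /* General path: one copy per row, skipping any per-row padding. */
    while (lines--)
    {
        memcpy(dst, src, width_bytes);
        dst += dst_stride;
        src += src_stride;
    }
}
```

The stricter condition (stride equal to the width, not just equal strides) avoids touching padding bytes past the last row, which might not be part of the buffer.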
Thanks, Giovanni.