I'm not completely sure about the mechanism, but I think it's simple enough to consider upstreaming it now. At the very least, it shows how we can leverage win32u surface changes to decide when to switch surfaces on- or off-screen and fall back to manual blitting.
Keeping the surfaces on-screen makes sure they are presented as efficiently as possible. When they are off-screen, we use a GDI blit that indirectly calls XCopyArea; this will be suboptimal, probably with visible tearing, but hopefully not too common.
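
To make the on/off-screen switch concrete, here is a minimal model of the decision only. It is not the code from this change: `struct client_surface`, `surface_update`, `surface_present_path` and the `needs_offscreen` flag are hypothetical names, and the condition that actually forces a surface off-screen is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not Wine code: the two ways a surface gets presented. */
enum present_path
{
    PRESENT_DIRECT,   /* on-screen: presented directly, the efficient path */
    PRESENT_GDI_BLIT, /* off-screen: copied with a GDI blit (XCopyArea underneath) */
};

/* Hypothetical per-surface state. */
struct client_surface
{
    bool offscreen;
};

/* Hypothetical hook, assumed to run whenever the win32u surface changes.
 * The caller decides whether the new state requires going off-screen. */
static void surface_update( struct client_surface *surface, bool needs_offscreen )
{
    assert( surface );
    surface->offscreen = needs_offscreen;
}

/* Pick the presentation path from the current state. */
static enum present_path surface_present_path( const struct client_surface *surface )
{
    assert( surface );
    return surface->offscreen ? PRESENT_GDI_BLIT : PRESENT_DIRECT;
}
```

The point of keeping the state to a single flag is that the slow blit path is only taken while the surface is off-screen, and the surface returns to direct presentation as soon as an update clears the flag.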