Thanks for the patch!
I don't want to request huge changes, and there's nothing prohibitively wrong with these tests, but I can see a few things that would be improvements, if you're up for it:
(1) Move them to d3d11. Yeah, it's a dxgi feature, but the tests use d3d11-specific code anyway, including a lot of stuff around device creation and readback that you've had to just duplicate here. I think it'd probably make more sense to just put the tests there.
(2) There are typically two ways that we write tests for nonuniform colors, and I think either one would probably be a bit more readable than this: either use d3d11-style functions that take a whole rect, or open-code the loop like in d3d11's test_texture_compressed_3d(). I see where the whole "test a 2x2 region all at once" approach is coming from, but I don't think it's the most readable.
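To make the "open-code the loop" option in (2) a bit more concrete, here is a rough standalone sketch of the shape I mean. It deliberately doesn't use the real d3d11 test helpers; struct rect, expected_color() and check_readback() are hypothetical names made up for illustration, the red/green colors are arbitrary, and row_pitch is assumed to be in pixels rather than bytes.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical rect type; right and bottom are exclusive. */
struct rect
{
    unsigned int left, top, right, bottom;
};

/* Hypothetical expectation: pixels inside the dirty rect were repainted
 * green, everything else still shows the previous frame's red. */
static uint32_t expected_color(const struct rect *dirty, unsigned int x, unsigned int y)
{
    if (x >= dirty->left && x < dirty->right && y >= dirty->top && y < dirty->bottom)
        return 0xff00ff00;
    return 0xff0000ff;
}

/* Open-coded per-pixel check. row_pitch is in pixels (an assumption of this
 * sketch). Returns the number of mismatching pixels. */
static unsigned int check_readback(const uint32_t *data, unsigned int row_pitch,
        unsigned int width, unsigned int height, const struct rect *dirty)
{
    unsigned int x, y, failures = 0;

    for (y = 0; y < height; ++y)
    {
        for (x = 0; x < width; ++x)
        {
            uint32_t expected = expected_color(dirty, x, y);
            uint32_t got = data[y * row_pitch + x];

            if (got != expected)
            {
                fprintf(stderr, "Got 0x%08x, expected 0x%08x at (%u, %u).\n",
                        (unsigned int)got, (unsigned int)expected, x, y);
                ++failures;
            }
        }
    }
    return failures;
}
```

The point is just that the expectation lives in one small function and the loop reports the exact coordinate that failed, which I find easier to follow than comparing 2x2 blocks.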
Some more minor comments:
(1) Why are you changing the window size between swap-effect tests?
(2) Those other parameters could probably just go into the tests themselves since they're not being changed either.
(3) What does test_partial_present_grid() demonstrate that test_partial_present_scroll() doesn't? I'm not saying we never want redundancy, but the former is distinctly harder to read, and I'm not sure it's adding anything beyond redundancy.