assuming that this test is useful in the first place, which it may well not be.
I've taken some advice from @huw on this one. We decided the tests may be worth keeping to validate the Wine implementation, but I've added a new column recording the `studio_value`, and I use that value to mark a test as broken. Technically it's not broken; the graphics card has just been set up differently.
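Roughly this shape (a sketch, not the actual patch: the struct layout and values are placeholders, and `get_surface_color()`/`compare_color()` are helpers in the style of the existing ddraw tests):

```c
static const struct
{
    DWORD src_color;
    DWORD expected;     /* full-range conversion result */
    DWORD studio_value; /* result from a differently configured driver */
}
yuv_tests[] =
{
    {0x00000000, 0x00000000, 0x00000000}, /* placeholder row */
};

unsigned int i;
D3DCOLOR color;

for (i = 0; i < ARRAY_SIZE(yuv_tests); ++i)
{
    /* ... draw yuv_tests[i].src_color and read the result back ... */
    color = get_surface_color(dst, 320, 240);

    /* A studio_value match is accepted as broken(): the driver isn't
     * buggy, it's just set up differently. broken() is always FALSE on
     * Wine, so Wine still has to produce the full-range value. */
    ok(compare_color(color, yuv_tests[i].expected, 1)
            || broken(compare_color(color, yuv_tests[i].studio_value, 1)),
            "Test %u: got color %#lx.\n", i, color);
}
```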
I had to raise the tolerance a bit (4 seems to work).
I've increased the tolerance for Windows to allow 4. That is, the test passes on both Wine and Windows if the difference is within 1, but is marked broken on Windows if it's between 1 and 4 (and fails on Wine). The tests are a bit different from usual here, I guess because we're not testing core Windows behavior but the settings and implementation of the graphics card driver. In any case, the tests now ~~pass~~ don't fail on the different testbot systems.
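In code that's roughly (sketch; `compare_color()` as above):

```c
/* A difference within 1 passes everywhere; a difference between 1 and 4
 * is accepted as broken() on Windows only, since broken() is always
 * FALSE on Wine. */
ok(compare_color(color, expected, 1)
        || broken(compare_color(color, expected, 4)),
        "Got color %#lx, expected %#lx.\n", color, expected);
```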
It also succeeds at YV12 surface creation but fails the subsequent Blt(), so we'll probably need to handle that.
I now handle this scenario by marking the test as broken and skipping the rest of the tests for that texture.
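Something along these lines, assuming it sits inside the per-format loop (sketch; the skip message and cleanup are illustrative):

```c
hr = IDirectDrawSurface7_Blt(dst, NULL, src, NULL, DDBLT_WAIT, NULL);
/* YV12 surface creation succeeds on this system, but Blt() doesn't. */
ok(hr == DD_OK || broken(FAILED(hr)), "Got unexpected hr %#lx.\n", hr);
if (FAILED(hr))
{
    /* No point running the remaining tests for this texture. */
    skip("Blt() failed for this format, skipping remaining tests.\n");
    IDirectDrawSurface7_Release(src);
    continue;
}
```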
Some tests actually pass with the current implementation, so I do need a `todo_wine_if`.
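E.g. (sketch; the `todo` field is a placeholder for whatever marks the failing cases):

```c
/* Only the cases Wine doesn't handle yet are marked todo;
 * todo_wine_if() leaves the already-passing ones as plain ok() checks. */
todo_wine_if(tests[i].todo)
    ok(compare_color(color, tests[i].expected, 1),
            "Test %u: got color %#lx.\n", i, color);
```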
Oops, I misread, sorry.
I decided the tests were a bit difficult to read, so I've reformatted them and used designated initializers (the struct member names) in the test data.
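E.g. (made-up fields, just to show the style):

```c
static const struct
{
    unsigned int width, height;
    DWORD caps;
    BOOL todo;
}
create_tests[] =
{
    /* was: {64, 64, DDSCAPS_TEXTURE, TRUE}, */
    {
        .width = 64,
        .height = 64,
        .caps = DDSCAPS_TEXTURE,
        .todo = TRUE,
    },
};
```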