CodeWeavers has two machines that I have been running WineTest on, in Linux, Windows 8.1 and Windows 10, in 32 and 64 bit. They are identical except for the graphics cards: AMD HD6800 for cw1 and Nvidia GTX560 for cw2.
During WineConf we decided that these should be able to run WineTest on Linux without any errors and thus become reference WineTest machines.
We are not quite there yet so here is the current status starting with the 32 bit tests.
* crypt32:chain Some of the certificates used by this test are present on Windows but missing on Linux, thus causing the error. Hans is looking into it and either the test will be modified to use other certificates present on both platforms, or the certificates will be added to these boxes. In the latter case this will be documented on the Wiki: https://wiki.winehq.org/Conformance_Tests#Running_WineTest_in_Wine
* d2d1:d2d1, d3d10core:device These tests only fail on the GTX560. The reason is unknown. We need someone to look into it.
* d3d8:device, d3d9:d3d9ex, d3d9:device, d3d9:visual Matteo fixed the radeon driver configuration and is looking into these failures, at least for the HD6800 case.
It is possible some of these failures are caused by the window manager (currently xfwm from Xfce). Replacing the window manager is no issue, we just need to know which one to pick. fvwm2?
* kernel32:console, kernel32:process These failures happen because the tests are being run from cron, with their output redirected to a log file, and thus they have no real terminal to work with. Interestingly the 64 bit tests are not failing. Vincent is investigating whether the tests should be modified or whether they should be run from an xterm, screen session or some equivalent.
* user32:winstation
  winstation.c:959: Test succeeded inside todo block: unexpected foreground window (nil)
  The cause for this failure has not been identified. Another window manager issue?
* ws2_32:sock
  sock.c:2811: Test succeeded inside todo block: Test[2]: expected 0, got 0
  We need someone to investigate this failure. It looks like it's random and relatively rare.
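For the kernel32:console and kernel32:process case, one way to confirm that the cron environment is the culprit is to check whether the standard file descriptors are attached to a terminal before launching the tests. Here is a minimal Python sketch of such a check. It is not part of WineTest, and the function name attached_to_terminal is made up for illustration:

```python
import os


def attached_to_terminal(fds=(0, 1, 2)):
    """Return True only if every given file descriptor refers to a terminal.

    By default this checks stdin (0), stdout (1) and stderr (2).
    """
    return all(os.isatty(fd) for fd in fds)


if __name__ == "__main__":
    if attached_to_terminal():
        print("terminal available")
    else:
        print("no terminal (cron-like environment, or output redirected)")
```

Run from cron with the output redirected to a log file, this should report that no terminal is available, while the same script started from an xterm or a screen session should report a terminal.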
Some tests just cannot be made to work reliably on Linux due to the asynchronous nature of X11 messaging. So another decision that was made is that tests should be run up to 3 times until they succeed. It's not entirely clear if that applies to all tests or just to specific tests that are known to be flaky.
The practical upshot is that people running 'make test' can re-run it a couple of times if it fails at first; if it succeeds within 3 tries, they can consider that it's all good.
These machines run WineTest so the results can be uploaded to test.winehq.org (with the relevant header describing the machine, the dlls, etc). So WineTest may need to be updated to deal with flaky tests according to this retry policy.
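To make the "up to 3 times" policy concrete, here is a minimal Python sketch of a retry wrapper. It is only an illustration, not how WineTest works or will work: the function name run_with_retries is hypothetical, and it assumes the command signals failure through a non-zero exit code.

```python
import subprocess
import sys


def run_with_retries(command, max_attempts=3):
    """Run `command` until it exits with status 0, at most `max_attempts` times.

    Returns a tuple (passed, attempts_used).
    Assumption: a non-zero exit code means the test run failed.
    """
    for attempt in range(1, max_attempts + 1):
        returncode = subprocess.run(command).returncode
        if returncode == 0:
            return True, attempt
        # Report the failed attempt on stderr and try again if attempts remain.
        print(
            f"attempt {attempt}/{max_attempts} failed with exit code {returncode}",
            file=sys.stderr,
        )
    return False, max_attempts
```

For example, run_with_retries(["make", "test"]) would stop at the first successful run and report how many attempts it took. Whether the retry should wrap the whole run, as here, or only the individual tests known to be flaky is exactly the open question mentioned above.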
Here is more detail about these machines:
* i7-2600K (4 cores+hyperthreading at 3.4GHz)
* 16 GB of RAM
* 768 GB of spinning disk
* Debian 8.2 64 bit
* cw1: AMD HD6800 running the open-source radeon driver.
* cw2: Nvidia GTX560 running the proprietary driver.