Hey all,
I thought I'd start my annual harangue a little early this year. Let me summarize the lens I see things through, then see if there is anything constructive we can do now, and again when we meet in person. That way we'll also have drinks to turn to if we end with nothing but despair.
The ideal is that this page: http://test.winehq.org/data/ be covered entirely in green. That would indicate that our unit tests ran successfully on all tested systems. A further ideal is that it would have 'Mac' and 'Android' columns that were also green. The holy dream we all crave is that a 'make test' would work in a rational fashion. And the ultimate fantasy is that every patch that changes the tests would be tested not just on Windows, but on Linux, Mac, and Android as well. (Alexandre does that by hand now; it would be nice to automate it.)
A more reasonable ideal is that all of our tests would run successfully on a well-curated list of 'rigorous' test machines. I don't believe we have an official page for that; I maintain an unofficial one here: https://www.winehq.org/~jwhite/latest.html That covers all of the 'newtb' Windows VMs, excluding Windows 2000, Windows 8, and Windows 10.
That list of failures has fluctuated, starting at about 40 and once getting down to as few as a dozen. It now stands at about 20, where it's been for a while. Nicely (?), we're down to only intermittent failures :-/.
I think the instinct has been to fix all of the Windows tests first; once those are consistently green, it would make sense to go after a well-defined Linux rig and push it to green.
CodeWeavers has a rack of hardware, and we're happy to put any flavor of system in there (and have done so, with standard rigs for testing AMD and NVidia Linux boxes). A quick scan suggests that those rigs stand at about the same number of failures: low to mid teens.
I think we are lacking several things:
1. Some help for Francois. He's basically doing all of this on his own. We could use some people willing to fight through Perl to help extend our capabilities.
2. The ability (will?) to drive the Windows tests to green. Is it time to articulate a class of test that is expected to fail periodically? In other words, do we have tests that 'reasonably' fail, and if so, should we redefine that kind of failure as acceptable?
In years past, we had constructive sessions where I forcibly prevented people from going to the pub until we had flipped a whole lot more bits to green. I can remember times when that worked quite nicely, but we haven't been as productive with it lately. Is it worth dedicating another block of time to this at WineConf? Would it work better if we prepared ahead of time?
For example, let me suggest this: developers start running winetest now on the machines they plan to bring to WineConf. Then we'll have that body of information available at the conference, and we can choose to attack the issues that show commonality.
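For anyone who hasn't run it recently, it's roughly the following (this is from memory, so treat it as a sketch; winetest's -h output is authoritative, and the tag and email address below are just placeholders):

    wget https://test.winehq.org/builds/winetest-latest.exe
    wine winetest-latest.exe -t yourtag -m you@example.com

The -t tag is what lets us find your results on test.winehq.org, and -m gives developers a way to reach you about a failure. On a Windows box, run the same binary natively rather than under Wine.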
Does that make sense?
Any other prep work we should do? (Jacek and Piotr assure me that there will be plenty of beer for drowning sorrows, so that seems to be covered).
Cheers,
Jeremy