On Monday, 23 October 2023 07:29:50 CDT Francois Gouget wrote:
Another part of the naive thinking was that if developers did not work on the tests it was just because the tools needed to be improved or the test environments were too unreliable. So for a long time all efforts have centered on that. But I think the CI is good enough now to not be the main obstacle [1].
I would agree that the tools are perfectly fine nowadays (well, except for the abandonment of the TestBot... but in terms of finding bugs to fix, the tools are fine).
I do quite like the patterns page. Though at this point, when I take time to fix tests, I find what's most helpful is the already filed bugs. How much effort do those bugs take to file? Is that sustainable, considering the other tasks on your plate? Is that a job you're comfortable with doing?
So the conclusion I have come to is that making further and lasting progress will require policy changes to incentivize work on the tests. This touches on the domain of social sciences so there will be no obvious 'right fix'. It's also something I cannot do myself since I don't have any authority to make policy changes.
Anyway, here are a few ideas, including some extreme ones, and hopefully we can decide on something that works for Wine:
Revert commits that introduced new failures.
- Do it the very next day if the failure is systematic?
- What if the failure is random and only happens once a day? Or once a week?
- What if the failure does not impact the CI? For instance if the CI has no macOS test configuration and the failure only happens on macOS.
- Should only the test part of the MR be reverted? (if that's the cause of the failure)
- Who makes the decision to revert? Alexandre? A dedicated person who will catch a lot of flak?
Block an author's new MRs if they did not fix failures introduced by one of their previous commits.
- This has the potential to slow down Wine development.
- Or the author could request their previous commit to be reverted to get unblocked.
If the CI shows failures, block the MR.
- That can still cause the Wine development to halt when the CI has a 100% failure rate (as has been the case for the GitLab CI recently).
- So it's only doable if the false positive rate is pretty low. But then it's likely to just result in the developer trying their luck with the next CI run.
I think we need something along the lines of blocking new patches, or reverting blamed commits. Nothing less drastic has worked.
There are a lot of test failures with unknown causes, though. Whose responsibility is it to fix them? Blocking *all* merge requests on those grounds doesn't solve that problem, but we will need a strong enough incentive to make sure that whoever is responsible actually fixes the bugs.
- If the CI shows failures, require that the author explain why they are not caused by their MR before it can be considered for merging.
- The TestBot's new failure lookup tool would be a good way to prove the failure pre-existed the MR. https://testbot.winehq.org/FailuresList.pl
- This is a softer version of the previous option and should not block Wine development. It may also push developers to fix the failures out of frustration at having to explain them away again and again since they cannot ignore them.
- Reviewers would also be responsible for verifying that the explanations are accurate and for objecting to the merge if not.
- Determining if the CI results contain new failures would no longer fall on Alexandre alone.
I think we had an (unofficial?) policy along these lines for some time. It may have helped avoid regressions, but it didn't result in existing failures getting fixed.
- Have someone dedicated to tracking the source of new failures, reporting them to the authors, following up on the failures, asking for progress on a fix, etc. This would be a developer role bordering on community manager.
I suppose you've been doing at least part of that, but even with a stronger role, I suspect we also need a way to prevent developers from saying "I don't have time to fix that".
Use the Wine party fund to pay developers to fix test bugs.
Send swag to developers who fixed 10 or more test failures. Or set rewards for fixing specific test units like user32:msg, d3d11:d3d11, etc.
For me personally, at least, this would be no motivation at all.
I do really want to fix tests, but I find it hard to justify spending work time to fix them in most cases, and while I do work on Wine in my free time, tests are not the only thing I want to spend time on.
- Point test.winehq.org/ to the patterns page instead of the index page: the patterns page better reflects the tests' progress [4] and thus is less discouraging than the main index page. https://gitlab.winehq.org/winehq/tools/-/merge_requests/71
I think this should be done; it's a more useful view in general.
--Zeb