Re: A case for policy changes to get to zero test failures

23 Oct 2023

      On Monday, 23 October 2023 07:29:50 CDT Francois Gouget wrote:
...
Another part of the naive thinking was that if developers did not work 
on the tests it was just because the tools needed to be improved or the 
test environments were too unreliable. So for a long time all efforts 
have centered on that. But I think the CI is good enough now to not be 
the main obstacle [1].
I would agree that the tools are perfectly fine nowadays (well, except for the abandonment of the TestBot... but in terms of finding bugs to fix, the tools are fine).
I do quite like the patterns page. Though at this point, when I take time to fix tests, I find what's most helpful is the already filed bugs. How much effort do those bugs take to file? Is that sustainable, considering the other tasks on your plate? Is that a job you're comfortable with doing?
...
So the conclusion I have come to is that making further and lasting 
progress will require policy changes to incentivize work on the tests. 
This touches on the domain of social sciences so there will be no 
obvious 'right fix'. It's also something I cannot do myself since I 
don't have any authority to make policy changes.
Anyway, here are a few ideas, including some extreme ones, and hopefully 
we can decide on something that works for Wine:

Revert commits that introduced new failures.

Do it the very next day if the failure is systematic?
What if the failure is random and only happens once a day? Or once a 
week?
What if the failure does not impact the CI? For instance if the CI 
has no macOS test configuration and the failure only happens on 
macOS.
Should only the test part of the MR be reverted? (if that's the 
cause of the failure)
Who makes the decision to revert? Alexandre? A dedicated person who 
will catch a lot of flak?

Block an author's new MRs if they did not fix failures introduced by 
one of their previous commits.

This has the potential to slow down Wine development.
Or the author could request their previous commit to be reverted to 
get unblocked.

If the CI shows failures, block the MR.

That can still cause Wine development to halt when the CI has a 
100% failure rate (as has been the case for the GitLab CI recently).
So it's only doable if the false positive rate is pretty low. But 
then it's likely to just result in the developer trying their luck 
with the next CI run.

I think we need something along the lines of blocking new patches, or reverting blamed commits. Nothing less drastic has worked.
There are a lot of test failures with unknown causes, though. Whose responsibility is it to fix them? Blocking *all* merge requests on those grounds doesn't solve that problem, but we will need a strong enough incentive to make sure that whoever is responsible fixes the bugs.
...

If the CI shows failures, require that the author explain why they are 
not caused by their MR before it can be considered for merging.
The TestBot's new failure lookup tool would be a good way to prove 
the failure pre-existed the MR.
https://testbot.winehq.org/FailuresList.pl
This is a softer version of the previous option and should not block 
Wine development. It may also push developers to fix the failures 
out of frustration at having to explain them away again and again 
since they cannot ignore them.
Reviewers would also be responsible for verifying that the 
explanations are accurate and for objecting to the merge if not.
Determining if the CI results contain new failures would no longer 
fall on Alexandre alone.

I think we had an (unofficial?) policy along these lines for some time. It may have helped avoid regressions, but it didn't result in existing failures getting fixed.
...

Have someone dedicated to tracking the source of new failures, 
reporting them to the authors, following up on the failures, asking 
for progress on a fix, etc. This would be a developer role bordering 
on community manager.

I suppose you've been doing at least part of that, but even with a stronger role, I suspect we also need a way to prevent developers from saying "I don't have time to fix that".
...

Use the Wine party fund to pay developers to fix test bugs.

Send swag to developers who fixed 10 or more test failures. Or set 
rewards for fixing specific test units like user32:msg, d3d11:d3d11, 
etc.

For me personally, at least, this would be no motivation at all.
I do really want to fix tests, but I find it hard to justify spending work time to fix them in most cases, and while I do work on Wine in my free time, tests are not the only thing I want to spend time on.
...

Point test.winehq.org/ to the patterns page instead of the index page: 
the patterns page better reflects the tests' progress [4] and thus is 
less discouraging than the main index page.
https://gitlab.winehq.org/winehq/tools/-/merge_requests/71

I think this should be done; it's a more useful view in general.
--Zeb
