Re: TestBot News - MR & WineTest false positives

15 Mar 2023


Here's an update on the false positive rates for merge requests and the
nightly Wine test runs.

Reminder: A false positive (FP) is when the TestBot or the GitLab CI
          says a failure is new when it is not.

* TestBot
  The FP rate is still between 5 and 10% (see attached graphs). We now
  have more history data, which shows that the FP rate went steadily
  down from 20% to 5% in December, i.e. during the freeze and when I
  was first populating the TestBot's list of known failures.
  https://testbot.winehq.org/FailuresList.pl

  Then over the month of January the average rate gradually went back
  up to about 10%. I chalk it up to riskier commits being allowed
  again. It would be nice for the FP rate to go back down to 5% but
  it's not clear if that will happen.

* GitLab CI
  The GitLab CI's FP rate also went down in December, hitting a low of
  10% at the new year. But in January it immediately went up again.
  Combined with the high November FP rate, the December dip is not
  really visible on the 5 week average.

  As I said, the FP rate has been going up since the new year. Again I
  think that's the effect of riskier commits going in. That shows on
  the 5 week average which is now between 25% and 30%, higher than ever
  before :-(

  Unlike the TestBot, the GitLab CI has no way to ignore known false
  positives. So if you don't want the GitLab CI claiming your merge
  requests introduce new failures, the only way is to fix the tests.
  And I guess that's not a bug. It's a feature [1].

  Where to start, you may ask?
  A good place would be the test units that cause the most false
  positives:

     22 dinput:device8
     17 ntdll:threadpool
     16 user32:msg
      9 d3d11:d3d11
      7 ws2_32:afd
      6 ws2_32:sock
      6 user32:win
      6 ole32:clipboard

And among those, some failure modes are particularly troublesome:

     17 dinput:device8   -> bug 54594
     16 ntdll:threadpool -> bug 54064
      9 d3d11:d3d11      -> bug 54510
      8 user32:msg       -> bug 54037
      7 ws2_32:afd       -> bug 54113
      6 ole32:clipboard  -> bug 54005

That is, user32:msg, for instance, can fail in many different ways but 
among the 16 times it caused a false positive (first list), 8 of them 
were because of the specific failure described in bug 54037 (second 
list).
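
To make the relationship between the two lists concrete, here is a small
Python sketch that computes, for each test unit, the share of its false
positives that come from the single bug listed for it. The counts are
copied from the two lists above. The names fp_by_unit, top_bug and
bug_share are made up for this illustration and are not part of the
TestBot or of the GitLab CI.

```python
# False positive counts per test unit (first list in this email).
fp_by_unit = {
    "dinput:device8": 22,
    "ntdll:threadpool": 17,
    "user32:msg": 16,
    "d3d11:d3d11": 9,
    "ws2_32:afd": 7,
    "ws2_32:sock": 6,
    "user32:win": 6,
    "ole32:clipboard": 6,
}

# Most troublesome failure mode per test unit (second list in this email),
# stored as (Bugzilla bug number, false positives caused by that bug).
top_bug = {
    "dinput:device8": (54594, 17),
    "ntdll:threadpool": (54064, 16),
    "d3d11:d3d11": (54510, 9),
    "user32:msg": (54037, 8),
    "ws2_32:afd": (54113, 7),
    "ole32:clipboard": (54005, 6),
}


def bug_share(unit):
    """Return (bug, fraction of the unit's FPs due to that bug), or None
    if the unit has no entry in the second list."""
    if unit not in top_bug:
        return None
    bug, count = top_bug[unit]
    return bug, count / fp_by_unit[unit]


for unit, total in fp_by_unit.items():
    share = bug_share(unit)
    if share is None:
        print(f"{total:7d} {unit:<17} no single bug listed")
    else:
        bug, fraction = share
        print(f"{total:7d} {unit:<17} bug {bug}: {fraction:.0%} of its FPs")
```

For user32:msg this gives 8 out of 16, i.e. half of its false positives,
which is the example spelled out above.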
[1] Not that it ever worked for the TestBot.
-- 
Francois Gouget fgouget@codeweavers.com
