tl;dr; Failures not caused by my patch!
First the reason why the TestBot reran all of the kernel32 tests on Wine is that I patched Wine's kernel32 dll. The TestBot does not really understand what the patch does so it assume that could break any of the dll's tests.
In fact a change in kernel32 could break pretty much any Wine test so to be thorough all the tests should be rerun. But while technically the TestBot has been able to do this for over a year, rerunning all the tests would take a lot of time (requiring more hardware resources), and, more importantly, as this case shows the quality of the Wine tests is just too poor for this to be workable.
Marvin wrote:
=== debian10 (32 bit report) ===
[...]
debugger: Timeout
I'm starting the analysis with this timeout. It's intermittent. The exception that preceded the timeout is pretty frequent, but the timeout happens less frequently. The last time it happened in this VM configuration was on October 31st: https://test.winehq.org/data/ccec532879ec14b2e79da08288152a69221ec4d1/linux_...
It is to avoid these false positives I'm working on bug 47998. https://bugs.winehq.org/show_bug.cgi?id=47998
[...]
kernel32: change.c:320: Test failed: should be ready
This failure too is intermittent: https://test.winehq.org/data/tests/kernel32:change.html
But it's much rarer. So much so that it never happened in this VM configuration, i.e. debian10 + win32 build + English, in the last 29 WineTest runs on record.
It did happen four times in other configurations though: Oct 31 - debian10 + win32 build + French Oct 27 - debian10 + win32 build + Japanese Oct 22 - debian10 + wow64 build + English Sep 25 - debian10 + win32 build + Japanese
Based on the test it does not look like the language would have an impact on this test so this failure could really happen at any time in any kernel32:change run.
But the TestBot cannot know that and, since this failure did not happen in this specific configuration in the history available to the TestBot, a fix for bug 47998 would not help with this false positive.
Sometimes fixing an intermittent failure is the only solution for cleaner test runs.
=== debian10 (64 bit WoW report) ===
kernel32: debugger.c:320: Test failed: GetThreadContext failed: 5 debugger.c:320: Test failed: GetThreadContext failed: 5
This might be confusing. The TestBot used the November 5 WineTest run as a reference which did have two of these failures. So why is it reporting these two failures as new?
That's because this run has 4 such failures in total. So the TestBot ignores the first two but correctly reports the next two as new.
In fact this failure happens a variable number of times because it's called for each of the process' threads and fails randomly. https://source.winehq.org/git/wine.git/blob/5b62f89baa82daecd430897de0bb5cab...
On Oct 28 it happened 5 times so a fix for bug 47998 would have avoided this particular false positive. But there's always the risk of having one more failure than the historical record.
Should the TestBot deduplicate the failure lines to avoid such issues? (*)
Duplicate failures (identical line number and failure) typically happen when a test has an array containing test information and loops on it. In such a case a patch adding a new entry that causes an extra failure should be flagged which would not happen if the TestBot is deduplicating the failures.
But such loops should contain the index of the tested array entry in their message which would thwart the deduplication.
The kernel32:debugger test is a special case in that there does not seem to be a meaningful way to differentiate the failures so it's a good case for deduplicating.
(*) If it is to be implemented the deduplication should be performed while extrating the failures from the test report. That's because once the failures have been extracted, any trace that we present between them has been lost. But such traces provide context and thus should prevent the deduplication from happening. This is also a place where line numbers can be taken into account (they are ignored when comparing failures since patches will change them).