https://bugs.winehq.org/show_bug.cgi?id=47998
          Bug ID: 47998
         Summary: Better deal with random test failures
         Product: Wine-Testbot
         Version: unspecified
        Hardware: x86
              OS: Linux
          Status: NEW
        Severity: normal
        Priority: P2
       Component: unknown
        Assignee: wine-bugs@winehq.org
        Reporter: fgouget@codeweavers.com
    Distribution: ---
Some tests fail randomly. They should be fixed of course, but in the meantime the TestBot should try not to report them as new failures when they happen.
To detect new errors the TestBot compares a task's test result with that of the latest WineTest run. So if a failure did not happen during the last WineTest run and then happens when testing a patch it will be reported as a new failure.
To avoid that the TestBot should take into account not just the latest WineTest report, but all the available test reports. This way, if the random failure happened in any of those runs it will be reported as pre-existing as expected.
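As a rough illustration of the difference, here is a small Python sketch (not the TestBot's actual Perl code; the function name and the sample failure lines are made up for the example). A failure is only considered new if it appears in none of the reference reports:

```python
def new_failures(task_errors, reference_reports):
    """Return the task errors that appear in none of the reference reports."""
    known = set()
    for report_errors in reference_reports:
        known.update(report_errors)
    # Preserve the order in which the task reported its errors.
    return [err for err in task_errors if err not in known]


# Hypothetical error lines: the random failure only showed up in an older run.
history = [
    ["test.c:10: Test failed: timeout"],  # older WineTest run
    [],                                   # latest WineTest run, no failure
]
task = ["test.c:10: Test failed: timeout", "test.c:42: Test failed: got 1"]

latest_only = new_failures(task, history[-1:])  # compares with the latest run only
full_history = new_failures(task, history)      # compares with every available run
```

With only the latest run as a reference both failures are flagged as new; with the full history only the test.c:42 failure is.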
How much of a history to take into account can then be adjusted by changing $JobPurgeDays, or adding a specific setting.
Note that this means if a failure is fixed and is reintroduced soon after it will not be reported as a new failure. This scenario should be rare enough to not be an issue in practice.
To implement this:
1. Store the WineTest reports in the var/latest directory with the following naming format:
   <vmname>-job<jobid>_<stepno>_<taskno>-<report>
2. At the start of WineRunTask and WineRunWineTest delete any reference report in the task's directory (in case the task is restarted), then make new hard links to the current set of reference reports. Handle this in LogUtils::GrabReferenceReports() so this code is shared.
3. Add LogUtils::AddReferenceReport() to deal with copying the WineTest reports to var/latest. Call this function when WineRunTask and WineRunWineTest complete.
4. In GetNewLogErrors() initially mark all errors as new. Then diff the current report with each of the reference reports located in the task's directory in turn and remove any error that's not new from the set of new errors.
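Steps 1 and 2 could be sketched as follows in Python. This is only an illustration: the real code would live in the Perl LogUtils module, the function names below are hypothetical, and selecting the reference reports by VM name prefix is an assumption based on the naming format.

```python
import os


def reference_name(vmname, jobid, stepno, taskno, report):
    """Build a name following <vmname>-job<jobid>_<stepno>_<taskno>-<report>."""
    return f"{vmname}-job{jobid}_{stepno}_{taskno}-{report}"


def grab_reference_reports(latest_dir, task_dir, vmname):
    """Replace the task's reference reports with hard links to the current set."""
    prefix = f"{vmname}-job"
    # Remove stale links first in case the task is being restarted.
    for name in os.listdir(task_dir):
        if name.startswith(prefix):
            os.unlink(os.path.join(task_dir, name))
    linked = []
    for name in sorted(os.listdir(latest_dir)):
        if name.startswith(prefix):
            # A hard link costs no extra disk space and survives the removal
            # of the original from latest_dir.
            os.link(os.path.join(latest_dir, name), os.path.join(task_dir, name))
            linked.append(name)
    return linked
```

Hard links require both directories to be on the same filesystem, which is assumed here.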
Note that currently the reference logs are simple copies of the original WineTest report. This means these are large files which must be parsed again to extract errors. With the current $JobPurgeDays setting there will be around 20 reference logs which will require 20 times as much log parsing. So there are two optimizations one can do, both happening when adding a new reference report (i.e. in AddReferenceReport()):
a. Instead of copying the full report, save only the errors. That's all the diff needs and this should reduce the size of the files by a factor of 10 (and thus speed up parsing).
b. After saving a new reference file, diff it against the old reference files. Delete any old reference file that has no error not already present in the new reference file. This will not help much if the set of failures is different with every run. But otherwise this will speed up both the parsing and diffing.
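Both optimizations can be sketched in a few lines of Python. This is a simplified illustration, not the TestBot code: the helper names are hypothetical and recognizing failures by the "Test failed" marker is an assumption, the real parser has to handle more cases.

```python
def extract_errors(report_lines):
    """Optimization a: keep only the lines that look like failures.

    Simplified assumption: a failure line contains ': Test failed: '.
    """
    return [line for line in report_lines if ": Test failed: " in line]


def prune_references(new_errors, old_references):
    """Optimization b: drop old references that add nothing to the new one.

    old_references maps a reference file name to its list of error lines.
    Returns the entries that still contain at least one error that is
    absent from the new reference and therefore must be kept.
    """
    new_set = set(new_errors)
    return {
        name: errors
        for name, errors in old_references.items()
        if not set(errors) <= new_set
    }


report = [
    "test.c:10: Test failed: timeout",
    "test.c:11: Tests skipped: no device",
    "test.c:42: Test failed: got 1",
]
new_ref = extract_errors(report)
kept = prune_references(new_ref, {
    "old1": ["test.c:10: Test failed: timeout"],  # subset of new_ref: dropped
    "old2": ["test.c:99: Test failed: got 0"],    # has an extra error: kept
    "old3": [],                                   # no error at all: dropped
})
```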
François Gouget <fgouget@codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|wine-bugs@winehq.org        |fgouget@codeweavers.com
François Gouget <fgouget@codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical
François Gouget <fgouget@codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #1 from François Gouget <fgouget@codeweavers.com> ---
This is fixed, including the deduplication part. There are a couple of caveats though:
* There are failures rare enough they are not present in a given configuration's WineTest history, but frequent enough that they will regularly happen on jobs running the test on a dozen or more configurations. The only solution for those is to fix the test.
* There are a lot of 'always new' failures, that is, failures where the message changes every time, thus preventing the TestBot from detecting that they happened before. Those both cause false positives and prevent the deduplication from being effective. The fix is to either remove the variable part (typically a pointer or handle value) or to enclose it in delimiters that the TestBot can recognize (see bug 48209).
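To illustrate why 'always new' failures defeat the comparison, here is a small Python sketch. It is not how the TestBot handles this: the regular expression and the function name are hypothetical, and treating every hexadecimal value as a variable part is an assumption made for the example.

```python
import re

# Assumption: the variable part is a hexadecimal value such as 0x7ffd1234.
_HEX_VALUE = re.compile(r"0x[0-9a-fA-F]+")


def normalize_error(line):
    """Replace hexadecimal values with a fixed placeholder."""
    return _HEX_VALUE.sub("0x?", line)


# Hypothetical failure lines from two runs of the same test.
run1 = "test.c:5: Test failed: unexpected handle 0x1a2b"
run2 = "test.c:5: Test failed: unexpected handle 0x3c4d"

same_raw = run1 == run2                                     # exact comparison
same_normalized = normalize_error(run1) == normalize_error(run2)
```

The raw lines differ, so an exact comparison treats the second one as a new failure; once the variable part is removed they match.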
commit 8c6d8e66595d376adbe3cb1f199dbb69331739a8
Author: Francois Gouget <fgouget@codeweavers.com>
Date:   Mon Feb 24 04:45:23 2020 +0100
testbot/LogUtils: Deduplicate the latest WineTest reports.
There is no need to keep old logs if they only contain errors that are already present in the latest one. It cuts down on the number of reports that test results need to be compared with to detect new failures.
Signed-off-by: Francois Gouget <fgouget@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
commit da678a9ebfe3862378543e80b03cd7c5496334cc
Author: Francois Gouget <fgouget@codeweavers.com>
Date:   Tue Feb 18 17:33:09 2020 +0100
testbot: Take into account multiple WineTest reports.
Instead of keeping only the latest WineTest results, keep all of them and only consider an error new if it does not appear in any of the WineTest results. This minimizes the risk of tagging a failure as new when it happens intermittently. However it also means the number of reference WineTest results can grow without bounds, so purge them after $JobPurgeDays. Since all reference reports end up in the latest/ directory this also makes it unnecessary to go fishing for them in the jobs/ directory when running UpdateTaskLogs on a specific job. Also, filtering reference reports to only use those older than the task can be done based on the reference report mtime instead of having to trust the jobid order.
Note: UpdateTaskLogs can optionally be run to update the new/old status in the errors cache files.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=47998
Signed-off-by: Francois Gouget <fgouget@codeweavers.com>
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
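The mtime-based filtering and the purge described in this commit message could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function names and parameters are hypothetical, and the actual TestBot code is written in Perl.

```python
import os
import time


def usable_references(ref_dir, task_started):
    """Return the reference reports last modified before the task started."""
    names = []
    for name in sorted(os.listdir(ref_dir)):
        if os.stat(os.path.join(ref_dir, name)).st_mtime < task_started:
            names.append(name)
    return names


def purge_references(ref_dir, purge_days, now=None):
    """Delete reference reports older than purge_days days.

    purge_days plays the role of $JobPurgeDays. Returns the removed names.
    """
    now = time.time() if now is None else now
    cutoff = now - purge_days * 24 * 60 * 60
    removed = []
    for name in sorted(os.listdir(ref_dir)):
        path = os.path.join(ref_dir, name)
        if os.stat(path).st_mtime < cutoff:
            os.unlink(path)
            removed.append(name)
    return removed
```

Relying on the modification time means a reference report is usable by any task that started after it was written, whatever the relative order of the job ids.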
François Gouget <fgouget@codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
--- Comment #2 from François Gouget <fgouget@codeweavers.com> ---
This is done. There are still some failures that are causing trouble, either because they are too rare or because the message changes every time. But those are the subject of bug 48912.
So I'm closing this bug.