https://bugs.winehq.org/show_bug.cgi?id=47998
Bug ID: 47998
Summary: Better deal with random test failures
Product: Wine-Testbot
Version: unspecified
Hardware: x86
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: unknown
Assignee: wine-bugs@winehq.org
Reporter: fgouget@codeweavers.com
Distribution: ---
Some tests fail randomly. They should be fixed, of course, but in the meantime the TestBot should try not to report them as new failures when they happen.
To detect new errors the TestBot compares a task's test result with that of the latest WineTest run. So if a failure did not happen during the last WineTest run and then happens when testing a patch it will be reported as a new failure.
To avoid that the TestBot should take into account not just the latest WineTest report, but all the available test reports. This way, if the random failure happened in any of those runs it will be reported as pre-existing as expected.
How much of a history to take into account can then be adjusted by changing $JobPurgeDays, or adding a specific setting.
Note that this means if a failure is fixed and is reintroduced soon after it will not be reported as a new failure. This scenario should be rare enough to not be an issue in practice.
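The comparison described above amounts to a set difference against the union of all reference runs. A minimal Python sketch of that idea follows; it is not the TestBot's code. The function name new_failures() and the failure identifiers are hypothetical, and each report is assumed to have already been reduced to a set of failure strings.

```python
def new_failures(current, references):
    """Return the failures in `current` that appear in none of `references`.

    `current` is an iterable of failure identifiers and `references` is a
    list of such iterables, one per reference WineTest run (assumed shape).
    """
    known = set().union(*references)  # every failure seen in any reference run
    return set(current) - known


# Hypothetical failure identifiers; "flaky" only failed in the older run.
history = [{"always"}, {"always", "flaky"}]  # latest run first
current = {"always", "flaky", "regression"}

# Comparing against the latest run only reports the random failure as new.
latest_only = new_failures(current, history[:1])
# Comparing against the whole history reports only the real regression.
with_history = new_failures(current, history)
```

With these inputs, latest_only contains both "flaky" and "regression", while with_history contains only "regression". The same sketch also shows the trade-off noted above: a failure that was fixed but is still present in any retained run stays in the known set, so reintroducing it is not flagged.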
To implement this:
1. Store the WineTest reports in the var/latest directory with the following naming format:
   <vmname>-job<jobid>_<stepno>_<taskno>-<report>
2. At the start of WineRunTask and WineRunWineTest delete any reference report in the task's directory (in case the task is restarted), then make new hard links to the current set of reference reports. Handle this in LogUtils::GrabReferenceReports() so this code is shared.
3. Add LogUtils::AddReferenceReport() to deal with copying the WineTest reports to var/latest. Call this function when WineRunTask and WineRunWineTest complete.
4. In GetNewLogErrors() initially mark all errors as new. Then diff the current report with each of the reference reports located in the task's directory in turn and remove any error that's not new from the set of new errors.
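Steps 1 and 2 can be sketched as follows. This is an illustrative Python sketch, not the LogUtils implementation: the function names, their parameters, and the choice to keep the same file name for the hard link inside the task's directory are all assumptions.

```python
import os
from pathlib import Path


def reference_name(vm_name, job_id, step_no, task_no, report):
    """Build <vmname>-job<jobid>_<stepno>_<taskno>-<report>."""
    return f"{vm_name}-job{job_id}_{step_no}_{task_no}-{report}"


def grab_reference_reports(latest_dir, task_dir, vm_name, report):
    """Refresh the hard links to the reference reports in task_dir.

    Assumption: links in the task's directory reuse the reference file name.
    """
    task_dir = Path(task_dir)
    prefix, suffix = f"{vm_name}-job", f"-{report}"

    def is_reference(path):
        return path.name.startswith(prefix) and path.name.endswith(suffix)

    # Delete links left over from a previous attempt (restarted task).
    for old in task_dir.iterdir():
        if is_reference(old):
            old.unlink()

    # Hard-link the current set of reference reports into the task's directory.
    linked = []
    for ref in sorted(Path(latest_dir).iterdir()):
        if is_reference(ref):
            target = task_dir / ref.name
            os.link(ref, target)
            linked.append(target)
    return linked
```

Hard links mean the task keeps a stable snapshot of the reference set even if var/latest is updated or purged while the task runs, without copying the files. They do require both directories to be on the same filesystem.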
Note that currently the reference logs are simple copies of the original WineTest report. This means these are large files which must be parsed again to extract errors. With the current $JobPurgeDays setting there will be around 20 reference logs which will require 20 times as much log parsing. So there are two optimizations one can do, both happening when adding a new reference report (i.e. in AddReferenceReport()):
a. Instead of copying the full report, save only the errors. That's all the diff needs and this should reduce the size of the files by a factor of 10 (and thus speed up parsing).
b. After saving a new reference file, diff it against the old reference files. Delete any old reference file whose errors are all already present in the new reference file. This will not help much if the set of failures is different with every run. But otherwise this will speed up both the parsing and diffing.
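Both optimizations can be sketched together in Python. This is only an illustration under stated assumptions: extract_errors() assumes failures are the lines containing "Test failed", which is a simplification of what the TestBot treats as an error, and prune_references() and the file names are hypothetical.

```python
def extract_errors(report_lines):
    """Optimization (a): keep only the error lines of a report.

    Assumption: an error is any line containing "Test failed".
    """
    return {line.strip() for line in report_lines if "Test failed" in line}


def prune_references(new_errors, old_references):
    """Optimization (b): split old references into (kept, redundant).

    `old_references` maps a reference file name to its set of errors.
    A reference is redundant when all of its errors are already in the
    new reference, i.e. its error set is a subset of `new_errors`.
    """
    new_errors = set(new_errors)
    kept, redundant = [], []
    for name, errors in old_references.items():
        if set(errors) <= new_errors:
            redundant.append(name)
        else:
            kept.append(name)
    return kept, redundant


# Hypothetical report content and reference names.
report = [
    "file.c:10: Test failed: unexpected value",
    "file.c:20: Test succeeded",
    "pipe.c:5: Test failed: timeout",
]
new_errors = extract_errors(report)
old = {
    "w10-job10_1_1-win32.report": {"file.c:10: Test failed: unexpected value"},
    "w10-job11_1_1-win32.report": {"sock.c:7: Test failed: refused"},
}
kept, redundant = prune_references(new_errors, old)
```

Here the job10 reference is redundant because its only error also appears in the new reference, while the job11 reference is kept because it holds an error seen nowhere else. Note that an old reference with no errors at all is always redundant under the subset test.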