https://bugs.winehq.org/show_bug.cgi?id=56179
Bug ID: 56179
Summary: GitLab fails to run some tests on Windows due to lack of PARENTSRC support
Product: WineHQ Gitlab
Version: unspecified
Hardware: x86-64
OS: Windows
Status: NEW
Severity: normal
Priority: P2
Component: gitlab-unknown
Assignee: wine-bugs@winehq.org
Reporter: fgouget@codeweavers.com

GitLab does not take PARENTSRC into account when running the Windows tests.
For instance MR!4837 introduces a failure in xaudio2_8:xaudio2: https://gitlab.winehq.org/wine/wine/-/merge_requests/4837/diffs
And yet the 32-bit Windows job shows no failure: https://gitlab.winehq.org/fgouget/wine/-/jobs/48209
While the 32-bit Linux job shows the expected failure: https://gitlab.winehq.org/fgouget/wine/-/jobs/48208
This is actually obvious given that the code that builds the list of tests to run on Windows (winetest.args) contains no reference to PARENTSRC:
- git diff --name-only $CI_MERGE_REQUEST_DIFF_BASE_SHA | sed -re '/\/tests\//!d; s@/tests/.*@/tests/Makefile.in@' | (xargs -r ls 2>/dev/null || true) | xargs -r sed '/TESTDLL/!d; s@.dll@@; s@.*= *@@' >usr/local/share/wine/winetest.args
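
One possible way to make that list PARENTSRC-aware is sketched below. It is only an illustration: affected_makefiles is a hypothetical helper, and it assumes that a test Makefile.in declares its parent on a single line of the form "PARENTSRC = ../../xaudio2_7/tests", relative to the directory containing the Makefile.in. It handles only one level of PARENTSRC and only looks at tests directories.

```sh
#!/bin/sh
# Sketch only: affected_makefiles is a hypothetical helper, not part of Wine.
# Reads changed paths on stdin and prints the tests/Makefile.in files they
# affect, either directly or through a PARENTSRC line.
# Usage: affected_makefiles <source-root> < changed-file-list
affected_makefiles() {
    root=$1
    # Reduce each path below a tests/ directory to that directory.
    sed -n 's@\(.*/tests\)/.*@\1@p' | sort -u | while read -r dir; do
        if [ -f "$root/$dir/Makefile.in" ]; then
            echo "$dir/Makefile.in"
        fi
        target=$(cd "$root/$dir" 2>/dev/null && pwd -P) || continue
        # Find test makefiles whose PARENTSRC resolves to the changed directory.
        # Assumption: PARENTSRC is relative to the makefile's own directory.
        grep -l '^PARENTSRC' "$root"/*/*/tests/Makefile.in 2>/dev/null |
        while read -r mk; do
            parent=$(sed -n 's/^PARENTSRC *= *//p' "$mk")
            resolved=$(cd "$(dirname "$mk")/$parent" 2>/dev/null && pwd -P) || continue
            if [ "$resolved" = "$target" ]; then
                echo "${mk#"$root"/}"
            fi
        done
    done | sort -u
}
```

Run from the top of the source tree, this could stand in for the first sed and ls stages of the job, for instance: git diff --name-only $CI_MERGE_REQUEST_DIFF_BASE_SHA | affected_makefiles . | xargs -r sed '/TESTDLL/!d; s@.dll@@; s@.*= *@@'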
François Gouget fgouget@codeweavers.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |rbernon@codeweavers.com

--- Comment #1 from François Gouget fgouget@codeweavers.com ---
The way the list of tests to run on Windows is built is really nice and compact. However, when I was working on the TestBot, I was told that "parsing the diff is a hack. The right way(tm) to figure out which tests to run on Windows is to just build and see what changed".
Of course that's easier said than done when all your CI jobs start from a clean source tree (no past object files), as is the case for GitLab.
Here are some ways this could be done anyway:
1. Build twice
   First build without the patch, grab the timestamps, then again with the MR applied and compare the timestamps to figure out which ones changed. Of course that's very inefficient, even with ccache.
2. Use a copy-on-write overlay
   This is essentially what the TestBot is doing: it rebuilds Wine after a Git push and takes a snapshot of the built state. Individual patches are then built from this state so one could actually compare the before and after timestamps.
   Unfortunately, as far as I know, GitLab does not provide a way to stack filesystems so that an old build could serve as a read-only base. Maybe this could be emulated by using a cache:
   - Do out-of-tree builds where the build directory is placed in a cache.
   - During the reference build use the cache in read-write mode (the default), which will populate the cache with the binary files from the pristine source.
   - It's also necessary to preserve the source file timestamps so a later Git checkout does not cause a full rebuild, ruining our attempt to preserve the binary timestamps.
   - When testing MRs use the cache in read-only mode using 'cache:policy:pull'.
   - Restore the source timestamps on the still pristine source.
   - Then apply the MR and build.
   - See which binary timestamps changed (it's probably not even necessary to save the original binary timestamps, just check which binaries have a timestamp newer than the start of the build).
3. Use reproducible builds
   First modify Wine to support a reproducible build mode.
   https://en.wikipedia.org/wiki/Reproducible_builds
   - Use the reproducible build mode during a reference build and save the binary checksums in a cache.
   - After building the MR, compute the new checksums and compare them to those in the cache. Any changed checksum indicates a binary that was impacted by the MR so the corresponding test(s) need to be re-run.
   Note: The cache can be made read-only using cache:policy:pull.
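
Options 2 and 3 both end with a comparison step, which could look roughly like the sketch below. The helper names (changed_since, record_checksums, changed_checksums) and the file patterns are hypothetical, and the checksum part assumes GNU sha256sum and paths without whitespace. It says nothing about how the cache itself would be set up.

```sh
#!/bin/sh
# Sketch only: helper names and file patterns are hypothetical.

# Option 2: list files matching <pattern> under <build-dir> that were
# modified after <stamp-file>, a file touched just before the MR build starts.
changed_since() {
    stamp=$1 builddir=$2 pattern=$3
    find "$builddir" -type f -name "$pattern" -newer "$stamp" | sort
}

# Option 3: record one "checksum  path" line per matching file, with paths
# relative to <build-dir> so that two build directories can be compared.
record_checksums() {
    builddir=$1 pattern=$2 output=$3
    (cd "$builddir" && find . -type f -name "$pattern" | sort | xargs -r sha256sum) >"$output"
}

# Print the paths in <new-list> whose checksum differs from the one in
# <reference-list>, or that are missing from <reference-list>.
changed_checksums() {
    awk 'FILENAME == ARGV[1] { ref[$2] = $1; next }
         ref[$2] != $1       { print $2 }' "$1" "$2"
}
```

With the timestamp variant the job would do something like "touch build.stamp" right before applying the MR and building, then call changed_since build.stamp . '*.exe'. With the checksum variant the reference job would store the output of record_checksums in the cache and the MR job would compare against it with changed_checksums.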
The next issue is: which binaries should be checked for changes, the test executables or the individual object files?
* A modified test executable does not tell us which individual test units need to be rerun. Some test executables have 30+ test units so rerunning them all can be quite inefficient.
* When an object file (or compiled resource file) is modified it is not always obvious which test unit should be rerun. This is the case for tests that have C files for drivers, helper binaries or resources. So without some way to identify which is which this can devolve to running all that module's test units anyway.
At least this last issue is mostly an optimization, though it can impact how many tests can be run with the available resources, and thus test coverage.