On Mon, 23 Oct 2023, Zeb Figura wrote: [...]
> I do quite like the patterns page. Though at this point, when I take time to fix tests, I find what's most helpful is the already filed bugs. How much effort do those bugs take to file?
It can be quite time consuming, depending on the volume of new failures and whether they are hard to track down. The procedure I go through looks something like this:
* First I try to identify a group of related failures. Usually that's easy, but it can be confusing when a lot of non-systematic new failures are mixed with lots of pre-existing ones. Also, I sometimes don't know enough about the test to tell whether all the failures can be fixed in one go or whether some will require a separate fix. I usually try to err on the side of not mixing things up; developers should feel free to mark bugs as duplicates when appropriate.
* Figure out how to reproduce the bug.
- That can be tricky when the test does not fail on its own, because then I have to figure out which other test is interfering (and that can be a dead end).
- Also, when the test does not always fail, bisects get more complicated.
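For the intermittent case, the first thing I want is a rough failure rate before attempting anything else. A minimal sketch of the rerun loop I have in mind (the test command is purely a placeholder; substitute whatever test unit is being investigated):

    # rerun.py: estimate how often an intermittent test fails.
    # The command is a placeholder; adjust it to your build layout.
    import subprocess

    CMD = ["./wine", "ole32_test.exe", "clipboard"]  # hypothetical test unit
    RUNS = 50

    failures = sum(
        # assumes the test binary exits non-zero when a test fails
        subprocess.run(CMD, capture_output=True).returncode != 0
        for _ in range(RUNS)
    )
    print(f"{failures}/{RUNS} runs failed")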
* Identify the commit that caused the test to fail.
- This is only doable on the machines I have access to. That makes macOS failures, for instance, easier to deal with since I can just skip this step (and many others).
- But identifying the commit helps figure out who is most likely to know what's going on and how to fix the issue, so I feel it's an important step.
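When the failure is deterministic, `git bisect run` can automate the search: it accepts any script that exits 0 for a good commit, 1 for a bad one, and 125 to skip. A sketch of such a helper, reusing the placeholder test command from above (the build and test commands are assumptions about the local setup):

    # bisect-check.py: run as `git bisect run python3 bisect-check.py`.
    import subprocess, sys

    # Rebuild incrementally; exit 125 so git bisect skips commits
    # that do not build.
    if subprocess.run(["make", "-j8"], capture_output=True).returncode != 0:
        sys.exit(125)

    # Placeholder test command; assumes it exits non-zero on failure.
    CMD = ["./wine", "ole32_test.exe", "clipboard"]
    sys.exit(0 if subprocess.run(CMD, capture_output=True).returncode == 0 else 1)

With an intermittent failure the helper would instead have to loop like the rerun sketch above and pick a failure-rate threshold, which is exactly why those bisects get complicated.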
* Identify the date of the first failure.
- Sometimes it's obvious from the patterns page.
- But when the test unit already has lots of failures, I grep a mirror of the test.winehq.org reports, sorted by date. (I also use that mirror to build myself a patterns page with 8 months of history.)
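The grep step is mechanical once the reports are mirrored locally. A sketch of what it amounts to, assuming a mirror directory where report paths sort chronologically (the path, layout, and example message are all assumptions, not the mirror's actual format):

    # first-failure.py: find the earliest mirrored report that contains
    # a given failure message. Assumes report paths sort by date.
    import re, sys
    from pathlib import Path

    MIRROR = Path("~/winetest-mirror").expanduser()  # hypothetical path
    pattern = re.compile(sys.argv[1])  # e.g. 'clipboard\.c:\d+: Test failed'

    for report in sorted(MIRROR.rglob("*.report")):
        if pattern.search(report.read_text(errors="replace")):
            print(f"first matching report: {report}")
            break
    else:
        print("no match in the mirrored reports")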
* And then there is the question of identifying which tests need to be looked at:
- I scan all the TestBot's WineTest job reports (ideally daily), updating failures-winetest.txt in the process. The TestBot is now quite good at identifying the new failures, so on good days that's fast; on bad days there are a lot of reports to look at.
That's the most efficient way to get a list of new failures, but only for those happening _in the TestBot_.
I usually try to file a bug as soon as possible so I can update the failures page and be sure the TestBot will not report the failure as new again.
Also, the TestBot automatically identifies unchanging failure messages and does not report them as new on the following days. That can lead one to think a failure was a one-off when in fact it happens systematically.
- I also scan the last job of every MR to identify which failures were present, updating failures-mr.txt in the process: those are the failures that are not considered to be new (otherwise the MR should not have been merged). When it's all green this is obviously fast, but otherwise it requires looking at all the logs. If a failure happens only once it may not be worth reporting; but failures-mr.txt shows me which ones are most common, and I try to report those first (a small tally sketch of that follows below).
This also allows me to identify failures that only happen in the GitLab CI and not in full WineTest runs.
- And from time to time I just go through the patterns page to identify non-TestBot, non-GitLab CI new failures, such as those that happen on Remi's boxes or mine.
Scanning the patterns page takes more time, so I don't do it as often.
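The "which ones are most common" part of the MR scan above is easy to mechanize. A sketch, assuming failures-mr.txt simply accumulates one failure message per line (the real file format may well differ):

    # tally-failures.py: rank the failure messages recorded in
    # failures-mr.txt so the most common ones get reported first.
    # Assumes one failure message per line; the real format may differ.
    from collections import Counter
    from pathlib import Path

    lines = Path("failures-mr.txt").read_text().splitlines()
    counts = Counter(line.strip() for line in lines if line.strip())

    for message, count in counts.most_common(10):
        print(f"{count:4d}  {message}")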
* That's all for reporting new failures, but sometimes failures get fixed without the bug being closed (which is quite understandable: the developer may just not be aware of the bug).
That does not have much of a negative impact on the TestBot, so I give closing bugs a much lower priority (though it can artificially inflate the failure mode count on the patterns page).
Closing these mostly involves looking at the TestBot's failures page and checking the entries that have not been matched in a while (or ever).
There's also a more interesting reason to look at those: identifying the entries where I got the regexps wrong, so the corresponding failures may still be reported as new (something I should notice in failures-winetest.txt, but only if the TestBot does not already identify the failure as old).
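Spotting the entries with broken regexps can be partly automated: any known-failure regexp that no longer matches a recent batch of failure messages is either fixed or mistyped. A sketch, assuming one regexp per line in the failures file and a plain list of recent messages (both file formats are assumptions):

    # stale-regexps.py: flag known-failure regexps that match nothing,
    # i.e. the failure was fixed or the regexp is wrong.
    # Both file formats below are assumptions.
    import re
    from pathlib import Path

    regexps = [l for l in Path("failures-winetest.txt").read_text().splitlines()
               if l.strip()]
    messages = Path("recent-messages.txt").read_text().splitlines()

    for r in regexps:
        try:
            if not any(re.search(r, m) for m in messages):
                print(f"never matched: {r}")
        except re.error as err:
            print(f"invalid regexp: {r} ({err})")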
* Finally, from time to time (less than once a month), I go through the failures page to identify the entries that are no longer needed because the failure has been fixed.
> Is that sustainable (considering the other tasks on your plate)?
When I have to focus on other things I generally have to stop looking at the tests for a while, so it's not totally sustainable.