On Sat, 1 May 2021, Zebediah Figura (she/her) wrote: [...]
Looks like a more sophisticated version of https://www.winehq.org/~jwhite/2deb8c2825af.html, which is definitely a nice resource when I'm trying to put effort into fixing test failures.
Right. I should probably have mentioned this bug, which says Jer's page was part of the inspiration. But that page did not do what I needed, so I tweaked it.
https://bugs.winehq.org/show_bug.cgi?id=48164
Oh. And now the official pages are online and getting more feature complete.
https://test.winehq.org/data/patterns-tb-win.html https://test.winehq.org/data/patterns-tb-wine.html
I guess the tests are color-coded by number of failures, modulo some constant?
Right. Each failure type (timeout, crash, etc.) has its own color. And then I use a gradient to assign a color to each 'vanilla' failure count.
Note that what counts for allocating the colors is not the actual failure counts, but the number of different failure counts. That is, a test with 4, 5 or 6 failures will get the same colors as one with 1, 2 or 100 failures because in both cases there are only 3 different values.
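To make that rule concrete, here is a minimal Python sketch. It is not the actual build-patterns code, and the name color_indices is made up for illustration. It ranks the distinct failure counts, so only the number of different values matters:

```python
def color_indices(failure_counts):
    """Map each distinct failure count to its rank among the distinct values.

    Hypothetical helper for illustration; the rank is what would then be
    turned into a position on the gradient.
    """
    distinct = sorted(set(failure_counts))
    return {count: rank for rank, count in enumerate(distinct)}


# Both tests have 3 different failure counts, so both use ranks 0, 1 and 2.
print(color_indices([4, 5, 6, 5, 4]))
print(color_indices([1, 2, 100]))
```

So 4/5/6 and 1/2/100 end up on the same three gradient positions.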
I'll add a description of the patterns on the pages at some point.
I like the idea. I will note though that some of those colours seem hard to tell apart, e.g. the shades of green in wine d3d9:device.
Yes. When a test unit has 30 different failure counts it's hard to find enough easy-to-distinguish colors. It's probably possible to do better by tweaking the colors the gradient goes through.
https://source.winehq.org/git/tools.git/blob/HEAD:/winetest/build-patterns#l...
The cyan-green-yellow part of the gradient produces colors that are not very easy to distinguish. The colors in the yellow-red part seem easier to identify but that gradient is given the same weight as the other two. I've experimented a bit with a darker cyan but going too dark does not look very nice.
Also I guess they aren't consistent across tests for some reason?
The goal is to maximize the contrast in the colors used by each pattern. But if I used a single 'color map' for all test units, I would need to allocate a hundred different colors. Then many test units with just a few failures would end up only using very similar colors.
Allocating one color map per test unit limits this issue to just a few patterns. And the best fix would be to reduce the number of failures in these tests ;-)
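The difference between the two approaches can be sketched in Python as follows. The function name gradient_positions and the figure of 100 global values are made up for illustration. A position of 0.0 is one end of the gradient and 1.0 the other:

```python
def gradient_positions(unit_counts, all_counts=None):
    """Return {failure count: position in [0, 1]} along the gradient.

    With all_counts set, one shared color map covers every value in it
    (unit_counts must be a subset). Otherwise the map is built from the
    test unit's own failure counts only.
    """
    source = unit_counts if all_counts is None else all_counts
    palette = sorted(set(source))
    last = max(len(palette) - 1, 1)
    return {c: palette.index(c) / last for c in sorted(set(unit_counts))}


unit = [1, 2, 3]
# Per-unit map: the three counts are spread over the whole gradient.
print(gradient_positions(unit))
# Shared map with 100 values: the same counts are squeezed together.
print(gradient_positions(unit, all_counts=range(1, 101)))
```

With the per-unit map the three counts sit at 0.0, 0.5 and 1.0. With the shared map they all fall within the first 3% of the gradient, so the colors would be nearly identical.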