On Fri, 2 Sep 2022, Huw Davies wrote: [...]
> The issue is that we don't know that the tests are flaky when they get committed; that only becomes apparent at some later stage.
Whatever CI is used, it should detect most flaky tests before they are committed. That's part of the point of running the tests not once but multiple times (the other being to test multiple configurations).
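
To put rough numbers on this, here is a minimal sketch of how repeated runs improve the odds of catching a flaky test. It assumes each run fails independently with a fixed probability, which real flaky tests may not do (failures often correlate with machine load or configuration). The function names are hypothetical and not part of any CI tool.

```python
import math


def detection_probability(failure_rate: float, runs: int) -> float:
    """Chance that at least one of `runs` independent runs fails.

    Assumes every run fails with the same probability `failure_rate`,
    independently of the others (an idealization).
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return 1.0 - (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, target: float) -> int:
    """Smallest number of runs whose detection probability reaches `target`."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= target for the smallest integer n.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # A test that fails 10% of the time, run once versus several times.
    for n in (1, 5, 10, 30):
        print(n, round(detection_probability(0.10, n), 3))
```

Under these assumptions a single run catches a test that fails one time in ten only 10% of the time, while a few dozen runs catch it with high probability. Tests that fail much more rarely would still slip through, which is why this detects most flaky tests rather than all of them.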