On Mon Aug 7 19:03:50 2023 +0000, Esme Povirk wrote:
Here's another alternative I'd be OK with: If any mismatch is found, calculate the common prefix and common suffix. Dump the lengths of those, along with the actual messages in between, and don't try to compare anything in the middle. For only the common messages (prefix/suffix or just the full thing), do a more detailed comparison. This would be a step towards LCS, but it wouldn't have nearly the complexity.
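To make the prefix/suffix proposal concrete, here is a minimal Python sketch, not the actual test code. It assumes messages can be compared as plain values. The function name `split_common` and the sample message lists are hypothetical and chosen only for illustration.

```python
def split_common(expected, received):
    """Return (prefix_len, suffix_len, expected_middle, received_middle)."""
    # Length of the common prefix: count matching messages from the start.
    prefix = 0
    for e, r in zip(expected, received):
        if e != r:
            break
        prefix += 1

    # Length of the common suffix, capped so it never overlaps the prefix.
    max_suffix = min(len(expected), len(received)) - prefix
    suffix = 0
    while suffix < max_suffix and expected[-1 - suffix] == received[-1 - suffix]:
        suffix += 1

    # Whatever lies between prefix and suffix is dumped as-is, not compared.
    expected_middle = expected[prefix:len(expected) - suffix]
    received_middle = received[prefix:len(received) - suffix]
    return prefix, suffix, expected_middle, received_middle


# Illustrative sequences only; not taken from any real test run.
expected = ["WM_CREATE", "WM_SIZE", "WM_MOVE", "WM_PAINT"]
received = ["WM_CREATE", "WM_MOVE", "WM_SIZE", "WM_ERASEBKGND", "WM_PAINT"]

prefix, suffix, exp_mid, recv_mid = split_common(expected, received)
print(f"common prefix: {prefix}, common suffix: {suffix}")
print("expected middle:", exp_mid)
print("received middle:", recv_mid)
```

Only the messages in the common prefix and suffix would then get the detailed comparison. Both scans are linear in the sequence length, which is where the saving over a full LCS comes from.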
My idea was that, with the full sequences printed out, and because you won't get all the possible sequences at once, you could gather them across the various runs and compare the failing sequences to figure out some common patterns. With only a partial diff, that becomes more difficult.
I don't have a strong opinion there. I think these message tests have a deeper flaw anyway: for instance, there are way too many optional messages. Some sequences are almost entirely optional, with only a few percent of the messages actually expected, and I don't see how that can verify anything useful. IMO the data-driven tests here are showing their limitations (or the data description we use is too limited).
I'm not really working in that area, so I'll leave this for @julliard to decide. I think your proposal is fine, and I was only trying to explore a different approach.