On 02/09/2010 12:01 AM, Francois Gouget wrote:
I'm pretty sure if I read the code I would understand this report better but I was not curious enough and unfortunately that means I'm pretty confused now.
You need to read only the SQL used to generate the tables ;)
It seems the data is sliced three ways:
I needed some catchy titles; I know those aren't that great. If somebody has better ideas I'm all ears.
- The Most Popular Messages
- Based on the name it sounds like someone voted on them, but then the definition says they 'are the most prevalent' which sounds like they are just the most frequent ones.
- So I look at the number of lines where they show up and the number one shows up in just 119 lines, while some way down it seems there are some that show up in hundreds of thousands of lines. So I'm lost.
With 'prevalent' I mean the number of files that contain that message, not how often the message shows up in those files. Basically the equivalent of grep -l | wc -l and sorted by that number.
Reason: Wine is too noisy and people tend to ignore the "stray" fixme and err messages here and there. They notice only if a message floods the output. Then it gets silenced to print only once and the message disappears from the radar. This table tries to show the fixme/err messages that a lot of people see and overlook. As a lot of people see them they are "popular". One example is the fixme("stub") messages in DllCanUnloadNow() that got fixed aka removed because of this report; everybody was seeing and ignoring them.
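To make the "files, not lines" counting concrete, here is a minimal Python sketch. It is not the SQL behind the report; the function name, the log names and the sample messages are all made up for illustration.

```python
from collections import Counter

def files_per_message(logs):
    """Count, for each message, how many logs contain it at least once."""
    counts = Counter()
    for lines in logs.values():
        # set() gives every message one vote per file, however often it repeats
        counts.update(set(lines))
    # Most widespread first; ties are broken alphabetically
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample data: log name -> list of fixme/err messages
sample_logs = {
    "a.log": ["fixme:x stub", "fixme:x stub", "err:y failed"],
    "b.log": ["fixme:x stub"],
    "c.log": ["fixme:z flood"] * 1000 + ["fixme:x stub"],
}

for message, file_count in files_per_message(sample_logs):
    print(file_count, message)
```

The message that floods c.log a thousand times still counts as one file here, which is why a row in that table can have a small line count.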
- It also does not help that the order of the Files and Lines column is reversed compared to all the other sections.
I did it to make it visible that the table is sorted by the first column aka "Files". Probably dropping the "Lines" column altogether would have made it clearer.
- Noisy Popular Messages / Functions
- So here the definition says they 'show up at least in 1% of the collected reports', so it sounds like it's the messages/functions that impact the most reports. So it's most frequent per report count.
Not quite. The "The Most Popular Messages" impact the most reports. I came up with this report after the "The Top Ten Single Charts" turned out to be pretty useless. There are messages that repeat hundreds of thousands of times in *one* single file. And there are no other files out of ~2500 files with that message. While that is annoying for the guy that got that flood, it is statistically irrelevant. Might be his setup; might be a broken commit round, who knows. But those are definitely not worth silencing aka using the if(once++) fixme() trick.
Thus I tried to find fixme/err messages that are noisy (lots of repeated messages in a file) and relevant (1% of the files). I figure Alexandre might be willing to accept the if(once++) fixme() trick for those.
- They are still sorted by Line count though. Shouldn't they be sorted some other way?
I have thought about sorting by the average number of messages per file; I'll look at it tomorrow and see if it makes more sense.
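A rough Python sketch of the "noisy and relevant" selection, sorted by the average number of messages per file. This is not the SQL used for the report. The 1% share comes from the definition above; the min_avg cut-off for "noisy" and all names and sample data are assumptions made up for illustration.

```python
from collections import Counter

def noisy_popular(logs, min_share=0.01, min_avg=2.0):
    """Return (message, files, lines, average lines per file) rows.

    min_share: fraction of all logs a message must appear in (relevant).
    min_avg:   hypothetical cut-off for "noisy"; not taken from the report.
    """
    files = Counter()   # number of logs containing the message
    lines = Counter()   # total number of occurrences over all logs
    for log in logs.values():
        per_file = Counter(log)
        lines.update(per_file)
        files.update(per_file.keys())
    threshold = min_share * len(logs)
    rows = []
    for msg in files:
        avg = lines[msg] / files[msg]
        if files[msg] >= threshold and avg >= min_avg:
            rows.append((msg, files[msg], lines[msg], avg))
    # Highest average per file first; ties are broken alphabetically
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows

# Hypothetical sample data: log name -> list of fixme/err messages
sample_logs = {
    "a.log": ["fixme:a noisy"] * 10 + ["err:b once"],
    "b.log": ["fixme:a noisy"] * 10 + ["err:b once"],
    "c.log": ["fixme:a noisy"] * 10 + ["err:b once"],
    "d.log": ["fixme:flood"] * 1000 + ["err:b once"],
}

# With only four logs the share is raised to 50% to make the filter visible
for row in noisy_popular(sample_logs, min_share=0.5):
    print(row)
```

The flood that lives in a single log is dropped by the share filter, and the message that shows up everywhere but only once per log is dropped by the average filter.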
- Is there a difference between a file and a report?
No, not in this context. Though I don't like either of those names:
- A file is also an email that has no fixme/err/warn messages.
- A report conflicts with the "Wine FIXME Report".
But I couldn't come up with a better name back then. A "log" would be a better name for the files that contain fixme/err/warn messages. But I'm open to other names too.
- The Top Ten Single Charts
- These really look like they are the most frequent by line count (based on the line count). But I have some doubts.
These are the maximum hits per file and not a sum of all hits. Pretty useless, and I've thought about dropping these tables as they aren't statistically relevant; more than 50% of those messages show up in only one file.
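For comparison, a small Python sketch of "maximum hits per file" next to the number of files a message appears in. Again this is only an illustration with made-up names and data, not the query that builds the tables.

```python
from collections import Counter

def single_file_maxima(logs, top=10):
    """Return (message, max hits in one log, that log, files with message)."""
    best = {}           # message -> (highest count seen in one log, log name)
    files = Counter()   # message -> number of logs containing it
    for name, log in logs.items():
        for msg, hits in Counter(log).items():
            files[msg] += 1
            if hits > best.get(msg, (0, ""))[0]:
                best[msg] = (hits, name)
    rows = [(msg, hits, name, files[msg]) for msg, (hits, name) in best.items()]
    # Largest single-log count first; ties are broken alphabetically
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows[:top]

# Hypothetical sample data: log name -> list of fixme/err messages
sample_logs = {
    "a.log": ["fixme:x stub"] * 3,
    "b.log": ["fixme:x stub"] * 5,
    "c.log": ["fixme:z flood"] * 1000,
}

for row in single_file_maxima(sample_logs):
    print(row)
```

The last column is what shows whether a chart entry matters: a message at the top with a file count of 1 is a single flood, not something many people see.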
bye michael