I wrote a script to analyze winetestbot results on all of the testbot vms except the win8 vms (they are just too broken to try to analyze right now).
I'm trying to get a handle on the nature of the bot failures; this current script looks for consistent failures (partly because a consistent failure that goes green is a win, and I badly want to track wins).
My results are here: http://www.winehq.org/~jwhite/ecd24b5a874e.html
The short summary is that we have 38 current failures. But we appear to have fixed 3 failures (at least on the testbot vms). We also have 14 that seem consistent (i.e. that should be more tractable).
I intend to use this to track progress and show status. This result is arguably not that interesting; I'm mostly posting it now so I can be publicly shamed if I fail to follow through.
Cheers,
Jeremy
On 3/10/2014 06:13, Jeremy White wrote:
> msxml3:saxreader.html http://test.winehq.org/data/tests/msxml3:saxreader.html
> xp_newtb-wxppro(0,1,2,3,4,5,6,7,8,9):
>   0 http://test.winehq.org/data/ecd24b5a874ead368c8f6e9d6981bb0e02472f9d/xp_newtb-wxppro/msxml3:saxreader.html
>   1 http://test.winehq.org/data/630e8d92578b347d6e94db097c05572bb416bb2e/xp_newtb-wxppro/msxml3:saxreader.html
>   2 http://test.winehq.org/data/49f3b4282d29890fe3c213d096571af4de2c45cd/xp_newtb-wxppro/msxml3:saxreader.html
>   3 http://test.winehq.org/data/9c5c3a81ceab5362513de6eb81cee921dcc52c14/xp_newtb-wxppro/msxml3:saxreader.html
>   4 http://test.winehq.org/data/049f08f4cda090189ae57d4ba58906d891ac3d4c/xp_newtb-wxppro/msxml3:saxreader.html
>   5 http://test.winehq.org/data/376953e00a97cc6ff5e18e8a8e0cd7fb70b15629/xp_newtb-wxppro/msxml3:saxreader.html
>   6 http://test.winehq.org/data/0eb626587b2f75e9904ba827eec1cd8a7f5789a2/xp_newtb-wxppro/msxml3:saxreader.html
>   7 http://test.winehq.org/data/fcae01672f2d480597a40850ff0386268b24791d/xp_newtb-wxppro/msxml3:saxreader.html
>   8 http://test.winehq.org/data/ccd8daf0f8564949e0811decf6a110b95be1a57a/xp_newtb-wxppro/msxml3:saxreader.html
>   9 http://test.winehq.org/data/37e0a1a5d4977a5f017709109dd6cf7a948b78e8/xp_newtb-wxppro/msxml3:saxreader.html
> 2000_newtb-w2000pro(1,2,4,5,7,9):
>   1 http://test.winehq.org/data/630e8d92578b347d6e94db097c05572bb416bb2e/2000_newtb-w2000pro/msxml3:saxreader.html
>   2 http://test.winehq.org/data/49f3b4282d29890fe3c213d096571af4de2c45cd/2000_newtb-w2000pro/msxml3:saxreader.html
>   4 http://test.winehq.org/data/049f08f4cda090189ae57d4ba58906d891ac3d4c/2000_newtb-w2000pro/msxml3:saxreader.html
>   5 http://test.winehq.org/data/376953e00a97cc6ff5e18e8a8e0cd7fb70b15629/2000_newtb-w2000pro/msxml3:saxreader.html
>   7 http://test.winehq.org/data/fcae01672f2d480597a40850ff0386268b24791d/2000_newtb-w2000pro/msxml3:saxreader.html
>   9 http://test.winehq.org/data/37e0a1a5d4977a5f017709109dd6cf7a948b78e8/2000_newtb-w2000pro/msxml3:saxreader.html
Here's a can of green paint for this:
http://www.winehq.org/pipermail/wine-patches/2014-March/131010.html
On 3/10/2014 06:13, Jeremy White wrote:
Regarding the Mac failures, it looks like this problem:
http://test.winehq.org/data/ecd24b5a874ead368c8f6e9d6981bb0e02472f9d/mac_fg-...
is caused by a missing or outdated libxslt.
On Mon, 10 Mar 2014, Nikolay Sivov wrote:
> On 3/10/2014 06:13, Jeremy White wrote:
> Regarding Mac failures, it looks like this problem:
> http://test.winehq.org/data/ecd24b5a874ead368c8f6e9d6981bb0e02472f9d/mac_fg-...
> is about missing/old libxslt
What's interesting is that the macdrv tests do not run into this issue.
The difference is that for the macdrv tests I set DYLD_FALLBACK_LIBRARY_PATH="/opt/local/lib" which causes them to use what I believe to be the MacPorts libxslt.dylib library.
For the x11drv tests I set DYLD_FALLBACK_LIBRARY_PATH="/opt/X11/lib" because the libX11.dylib library does not (or did not) play well with XQuartz. As a result the X11 tests use /usr/lib/libxslt.dylib which is the library shipped with Snow Leopard.
I'll retry running the x11drv tests with /opt/local/lib.
However, given that MacPorts is not part of Mac OS X, that raises the question of what sort of system tweaks make sense in order to run the Wine conformance tests without error.
On 3/10/2014 19:10, Francois Gouget wrote:
> On Mon, 10 Mar 2014, Nikolay Sivov wrote:
>> On 3/10/2014 06:13, Jeremy White wrote:
>> Regarding Mac failures, it looks like this problem:
>> http://test.winehq.org/data/ecd24b5a874ead368c8f6e9d6981bb0e02472f9d/mac_fg-...
>> is about missing/old libxslt
> What's interesting is that the macdrv tests do not run into this issue.
> The difference is that for the macdrv tests I set DYLD_FALLBACK_LIBRARY_PATH="/opt/local/lib" which causes them to use what I believe to be the MacPorts libxslt.dylib library.
> For the x11drv tests I set DYLD_FALLBACK_LIBRARY_PATH="/opt/X11/lib" because the libX11.dylib library does not (or did not) play well with XQuartz. As a result the X11 tests use /usr/lib/libxslt.dylib which is the library shipped with Snow Leopard.
> I'll retry running the x11drv tests with /opt/local/lib.
> However given that MacPorts is not part of Mac OS X that raises the question of what sort of system tweaks make sense in order to run the Wine conformance tests without error.
So it failed while running with the system-shipped version? I think we should use the version that we expect users to use, and I don't know which one that is. If the system version is old enough to cause trouble, then we should be using something more up-to-date from MacPorts. The system lib is unlikely to be updated by a system update, right? Especially for systems that are not supported anymore (not sure if Snow Leopard still gets updates).
If it's decided to use the system lib no matter what, then the test should be improved to skip in such cases.
On 2014-03-10 03:13, Jeremy White wrote:
> - ddraw:ddraw4.html 2000_newtb-w2000pro(7,9)
> - ddraw:ddraw7.html 2000_newtb-w2000pro(7,9) win7_newtb-w7u(7,8,9)
Some of the lines in them are fairly easy to fix, I'll send a patch.
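For anyone who wants to work with these summary lines programmatically, here is a rough sketch of a parser. It is only an illustration, not part of the analysis script: the name `parse_failure_line` is made up, and it assumes the numbers in parentheses are the indices of the runs in which the test failed (with 0 apparently being the most recent run).

```python
import re

def parse_failure_line(line):
    """Split a summary line such as
    'ddraw:ddraw7.html 2000_newtb-w2000pro(7,9) win7_newtb-w7u(7,8,9)'
    into the test name and a {vm_name: [run indices]} mapping.

    Assumption: the parenthesised numbers are the run indices in which
    the test failed on that VM.
    """
    test, *items = line.split()
    vms = {}
    for item in items:
        # Each item looks like vmname(n,n,n); anything else is ignored.
        m = re.fullmatch(r"([^()]+)\(([\d,]+)\)", item)
        if m:
            vms[m.group(1)] = [int(n) for n in m.group(2).split(",")]
    return test, vms

test, vms = parse_failure_line(
    "ddraw:ddraw7.html 2000_newtb-w2000pro(7,9) win7_newtb-w7u(7,8,9)")
print(test, vms)
```

Items that do not match the vmname(indices) shape are skipped silently, which seemed the least surprising choice for a sketch.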
On 2014-03-10 11:04, Stefan Dösinger wrote:
> On 2014-03-10 03:13, Jeremy White wrote:
>> - ddraw:ddraw4.html 2000_newtb-w2000pro(7,9)
>> - ddraw:ddraw7.html 2000_newtb-w2000pro(7,9) win7_newtb-w7u(7,8,9)
> Some of the lines in them are fairly easy to fix, I'll send a patch.
Actually the machines I looked at weren't from the testbot at all, but fgouget's VMware machine. Still at least one failure on them should be fixable.
Francois, can you test the attached patch on your fg-win7u64-* VM? It doesn't matter which one, except for the 0sp one, which seems to time out. This patch should fix this kind of error in ddraw4 and ddraw7:
ddraw4.c:4190: Test failed: Got unexpected hr 0x88760104 for format D3DFMT_YUY2, resource type videomemory overlay, size 1x1, expected 0.
There are still other failures left in those tests, so they still won't pass. The error count should be considerably lower though.
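As an aside, the hr value in that failure line can be split into its fields to see which error is being returned. A minimal sketch: `decode_hresult` is a hypothetical helper, and the 13-bit facility mask follows the HRESULT_FACILITY macro in winerror.h. I believe code 260 in the DirectDraw facility (0x876) is DDERR_NOOVERLAYHW, but please check ddraw.h rather than take my word for it.

```python
def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code).

    The facility mask 0x1FFF mirrors the HRESULT_FACILITY macro;
    the helper name itself is just for illustration.
    """
    failed = bool(hr & 0x80000000)   # severity bit
    facility = (hr >> 16) & 0x1FFF   # facility field
    code = hr & 0xFFFF               # error code within the facility
    return failed, facility, code

failed, facility, code = decode_hresult(0x88760104)
print(failed, hex(facility), code)
```

For 0x88760104 this gives a failure in facility 0x876 with code 260.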
On Mon, 10 Mar 2014, Stefan Dösinger wrote: [...]
> Francois, can you test the attached patch on your fg-win7u64-* VM? It doesn't matter which one, except for the 0sp one, which seems to time out. This patch should fix this kind of error in ddraw4 and ddraw7:
> ddraw4.c:4190: Test failed: Got unexpected hr 0x88760104 for format D3DFMT_YUY2, resource type videomemory overlay, size 1x1, expected 0.
Yep. I tested it in fg-win7u64-1spie9 and the patch fixes these failures. It leaves 16 and 18 test failures for ddraw4 and ddraw7 respectively, which is a big improvement over the 143 and 145 the VM had before.
On 10 March 2014 03:13, Jeremy White <jwhite@codeweavers.com> wrote:
> I wrote a script to analyze winetestbot results on all of the testbot vms except the win8 vms (they are just too broken to try to analyze right now).
> I'm trying to get a handle on the nature of the bot failures; this current script looks for consistent failures (partly because a consistent failure that goes green is a win, and I badly want to track wins).
> My results are here: http://www.winehq.org/~jwhite/ecd24b5a874e.html
Note that e.g. the win2000 testbot doesn't have results for all runs. It looks like this causes the script to classify some failures that should be "fixed" as intermittent failures. That might in turn cause someone to draw wrong conclusions about e.g. the ddraw tests, if they didn't pay enough attention to wine-patches.
On 03/10/2014 06:08 AM, Henri Verbeet wrote:
> On 10 March 2014 03:13, Jeremy White <jwhite@codeweavers.com> wrote:
>> I wrote a script to analyze winetestbot results on all of the testbot vms except the win8 vms (they are just too broken to try to analyze right now).
>> I'm trying to get a handle on the nature of the bot failures; this current script looks for consistent failures (partly because a consistent failure that goes green is a win, and I badly want to track wins).
>> My results are here: http://www.winehq.org/~jwhite/ecd24b5a874e.html
> Note that e.g. the win2000 testbot doesn't have results for all runs. It looks like this causes the script to classify some failures that should be "fixed" as intermittent failures. That might in turn cause someone to draw wrong conclusions about e.g. the ddraw tests, if they didn't pay enough attention to wine-patches.
Yes; not just win2k, but the win7u bot is unreliable, and one of the other win7 bots and one of the vista bots have a few drop outs as well.
But my code, in theory, skips holes in the data, so long as the data stays in line.
In other words, a pattern like this: SS-FFFF-FF where S is success, F is failure, and - is missing data, is considered 'fixed'. A pattern like this: F-F--F-F-F is considered 'consistently failing'. All other patterns are considered intermittent.
Note that it's only against the newtb vms; so you'll see the claim that d3d9:stateblock is fixed, but there is one non newtb machine where it still fails.
Cheers,
Jeremy
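The rule as described in this message (holes are skipped, successes followed by failures means 'fixed') can be sketched in a few lines of Python. This is not the actual script, just an illustration of the description. The function name `classify` and the 'passing' and 'no data' labels are assumptions, and the history string is taken to be newest run first, which is what the 'fixed' example suggests.

```python
import re

def classify(history):
    """Classify a per-VM history string made of 'S' (success),
    'F' (failure) and '-' (missing data).

    Assumption: the string is ordered newest run first, so 'SSFF'
    means the two most recent runs passed.
    """
    runs = history.replace("-", "")   # skip holes in the data
    if not runs:
        return "no data"
    if re.fullmatch(r"S+", runs):
        return "passing"
    if re.fullmatch(r"F+", runs):
        return "consistently failing"
    if re.fullmatch(r"S+F+", runs):
        return "fixed"
    return "intermittent"

for h in ("SS-FFFF-FF", "F-F--F-F-F", "SFSF-S"):
    print(h, "->", classify(h))
```

With this reading, a history with a leading hole and nothing but failures is still 'consistently failing', not 'fixed'.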
On 10 March 2014 13:26, Jeremy White <jwhite@codeweavers.com> wrote:
> But my code, in theory, skips holes in the data, so long as the data stays in line.
> In other words, a pattern like this: SS-FFFF-FF where S is success, F is failure, and - is missing data, is considered 'fixed'. A pattern like this: F-F--F-F-F is considered 'consistently failing'. All other patterns are considered intermittent.
The data for ddraw7 on Windows 2000 for example is "-SS-SS-F-F".
On 03/10/2014 07:55 AM, Henri Verbeet wrote:
> On 10 March 2014 13:26, Jeremy White <jwhite@codeweavers.com> wrote:
>> But my code, in theory, skips holes in the data, so long as the data stays in line.
>> In other words, a pattern like this: SS-FFFF-FF where S is success, F is failure, and - is missing data, is considered 'fixed'. A pattern like this: F-F--F-F-F is considered 'consistently failing'. All other patterns are considered intermittent.
> The data for ddraw7 on Windows 2000 for example is "-SS-SS-F-F".
Yeah, I explained it incorrectly (and the code is rough, and quite possibly wrong). A pattern requires exact edges to be considered fixed; so 'SS-FFFF-FF' would be considered indeterminate. 'SSF-FFF-FF' would be considered fixed.
I'll see about tweaking that (I changed it to prevent '-FFFFFFFFF' from being considered 'fixed' <grin>. But I can fix that a different way).
Cheers,
Jeremy
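One way to read the stricter 'exact edges' behaviour described here is that the last success and the first failure must sit directly next to each other, with no hole in between. The sketch below is that interpretation only and may well differ from what the real script does. The name `classify_strict` and the 'passing' and 'no data' labels are made up, and the string is again assumed to be newest run first.

```python
import re

def classify_strict(history):
    """Like the lenient rule, but 'fixed' additionally requires the
    success-to-failure edge to be exact (an 'S' directly followed
    by an 'F', with no '-' in between).

    Assumption: history is ordered newest run first.
    """
    runs = history.replace("-", "")
    if not runs:
        return "no data"
    if set(runs) == {"S"}:
        return "passing"
    if set(runs) == {"F"}:
        return "consistently failing"
    # S+F+ has exactly one S-to-F transition once holes are dropped;
    # "SF" in the raw string means that transition has no hole in it.
    if re.fullmatch(r"S+F+", runs) and "SF" in history:
        return "fixed"
    return "intermittent"

for h in ("SS-FFFF-FF", "SSF-FFF-FF", "-SS-SS-F-F", "-FFFFFFFFF"):
    print(h, "->", classify_strict(h))
```

Under this reading the ddraw7 data for Windows 2000, "-SS-SS-F-F", lands in the intermittent bucket because of the hole between the last success and the first failure.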
> I'll see about tweaking that (I changed it to prevent '-FFFFFFFFF' from being considered 'fixed' <grin>. But I can fix that a different way).
I've updated it.
http://www.winehq.org/~jwhite/ecd24b5a874e.html
We now have 10 purported fixes! Woohoo!
I've also added annotations, which was the main feature I was planning, so I could work the list more intelligently. I've hopefully added in the relevant notes from Francois.
My main focus is on the test bot vms, as those match the automated patch screen. (Doesn't mean we shouldn't fix some of the other interesting tests; I'm just trying to limit the scope).
Cheers,
Jeremy
I've updated it slightly to show the S/-/F indicators, and run it against the latest Wine: http://www.winehq.org/~jwhite/770213e16c69.html
The good news is that Nikolay has, apparently, painted msxml3:saxreader a nice shade of green.
A few other tests are now being considered intermittent (advapi32:eventlog, urlmon:url, and comdlg32:filedlg). That's not a material change; it mostly just highlights flaws in the previous analysis.
So, 1 down, 32 to go...
Cheers,
Jeremy
* ntdll:exception (All QEmu VMs) These are caused by a known QEmu bug. That bug got fixed^H^H^Hreplaced by another bug in 1.7.0. See: https://bugs.launchpad.net/qemu/+bug/1119686
Also a test may have multiple independent failures. So it's important to look at the individual test failures and gather clues from the tests they fail on.
For instance:
user32:msg.html win7_newtb-w7u(0,1,2,7,8,9) xp_newtb-wxppro(4)
- There are 49 failures on the newtb-w7u VM and under a dozen on the other TestBot VMs. The reason is that the newtb-w7u VM is set up with a Japanese locale, which causes extra failures. Note that one can get more information about a given VM by clicking on the 'info' link in the test results. Here: http://test.winehq.org/data/ecd24b5a874ead368c8f6e9d6981bb0e02472f9d/win7_ne...
These extra user32:msg test failures in Japanese locales have been documented: http://bugs.winehq.org/show_bug.cgi?id=35611
- The same test also exhibits specific errors in the Hebrew locale. We don't have Hebrew or other RTL VMs in the TestBot, but my Windows 7 VM has one such test configuration: fg-win7u64-he. http://bugs.winehq.org/show_bug.cgi?id=35610
- The remaining Windows 7 VMs have a group of 3 'region' failures, and another unrelated group of 3 'message 31f' failures which appears to be somewhat random (or at least does not affect all VMs).
- My Windows XP VMs have a totally unrelated set of 3 'minimum timeout' failures, which sometimes appear on Windows 2003 and Windows 8. http://bugs.winehq.org/show_bug.cgi?id=34915
So as one can see things can be quite complex under the hood.
Here are some test failures I looked into and tried to diagnose a bit (all VMs, not just the WineTestBot ones):
* Bug 35573 - gdi32:fonts test_stock_fonts() fails on Windows 7 in the Japanese and Hebrew locales http://bugs.winehq.org/show_bug.cgi?id=35573
* Bug 35760 - gdi32:font test_fullname2() fails on Windows 7 in the French locale http://bugs.winehq.org/show_bug.cgi?id=35760
* Bug 33720 - user32:menu This one is intermittent. http://bugs.winehq.org/show_bug.cgi?id=33720
* Bug 33718 - comctl32:propsheet Add button test failure http://bugs.winehq.org/show_bug.cgi?id=33718
* Bug 33719 - comctl32:propsheet custom window proc test failure http://bugs.winehq.org/show_bug.cgi?id=33719
And for some Windows 8 issues:
* Bug 35575 - gdi32:font Windows 8.1 failures http://bugs.winehq.org/show_bug.cgi?id=35575
* Bug 34830 - rpcrt4:cstub fails and crashes on Windows 8 http://bugs.winehq.org/show_bug.cgi?id=34830
* Bug 34829 - wintrust:softpub crashes on Windows 8 http://bugs.winehq.org/show_bug.cgi?id=34829