From: Reece Dunn msclrhd@googlemail.com
> The Wine tests are now passing on a Windows XP machine
> (http://test.winehq.org/data/4b27dfec939d131c9d7e09f97f14dfc7dabe8843/#group_XP).

Actually, that was not the first time. On 20 Jan an XP machine passed:
http://test.winehq.org/data/e9d8c9f572998054b1f9c386ea81a3570c65f2d2/#group_XP
and on 27 Jan a 2003 machine:
http://test.winehq.org/data/8f829034f3fe4da3e7adce2f4685e10ba2e2fe82/#group_2003
Two of the other four only have 2 or 3 failures.
The 2003 group has 3 machines with 1 failure each (urlmon:protocol on one, user32:menu on the others).
I sent a patch for the user32:menu failure a couple of weeks ago, but it got blackholed: http://www.winehq.org/pipermail/wine-patches/2009-January/067483.html
I won't be submitting test results next week but don't worry, I'll be back...
Ge.
2009/1/30 Ge van Geldorp ge@gse.nl:
> From: Reece Dunn msclrhd@googlemail.com
>> The Wine tests are now passing on a Windows XP machine
>> (http://test.winehq.org/data/4b27dfec939d131c9d7e09f97f14dfc7dabe8843/#group_XP).
>
> Actually, that was not the first time. On 20 Jan an XP machine passed:
> http://test.winehq.org/data/e9d8c9f572998054b1f9c386ea81a3570c65f2d2/#group_XP
> and on 27 Jan a 2003 machine:
> http://test.winehq.org/data/8f829034f3fe4da3e7adce2f4685e10ba2e2fe82/#group_2003
Nice! Paul Vriens already mentioned this.
Quite a few of the XP and 2003 machines have 1-3 failures, so we should soon see more Windows machines running at 0 failures. It also means we could potentially use those machines to investigate the inconsistencies in the test runs, like Dan did with the Wine runs when setting up patchwatcher.
This could be used to see exactly what fails when you change the DPI settings, for example. Or use a different theme. Or no theme. And other things that the user could change, doing them one at a time to get a clear "this is what changes when you do X".
I suspect that the user32:sysparams tests would be complex to fix.
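As a minimal sketch of what that could look like (report_dpi is a hypothetical helper, not something in the actual test suite), a run could log the effective DPI up front so failures can be correlated with the display settings:

  #include <windows.h>
  #include <stdio.h>

  /* Log the effective screen DPI so a test run can be correlated
     with the machine's display settings. */
  static void report_dpi(void)
  {
      HDC hdc = GetDC(NULL);                      /* screen device context */
      int dpi_x = GetDeviceCaps(hdc, LOGPIXELSX);
      int dpi_y = GetDeviceCaps(hdc, LOGPIXELSY);
      ReleaseDC(NULL, hdc);
      printf("running at %d x %d DPI\n", dpi_x, dpi_y);
  }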
> Two of the other four only have 2 or 3 failures.
> The 2003 group has 3 machines with 1 failure each (urlmon:protocol on one, user32:menu on the others).
> I sent a patch for the user32:menu failure a couple of weeks ago, but it got blackholed: http://www.winehq.org/pipermail/wine-patches/2009-January/067483.html
Do you know why you need to do a SetForegroundWindow there to make the tests succeed? Why is the input being sent to a different window? Why does it vary from platform-to-platform and machine-to-machine?
When you understand that, you can either give that evidence in the resubmit - which would give the patch a better chance of being accepted - or use it to improve the test in another way that is more correct (if that is what the investigation shows). For example, are the tests failing because of DPI settings? Or has Windows changed the menu sizes such that the coordinates are wrong?
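For reference, the kind of change the patch makes, as a minimal sketch using plain Win32 calls (make_foreground is an illustrative name, not the actual test code); the retry loop hedges against SetForegroundWindow being refused or taking effect asynchronously:

  #include <windows.h>

  /* Bring the test window to the foreground and wait until the
     switch has actually happened before simulating menu input. */
  static BOOL make_foreground(HWND hwnd)
  {
      int i;
      for (i = 0; i < 50; i++)
      {
          SetForegroundWindow(hwnd);
          if (GetForegroundWindow() == hwnd) return TRUE;
          Sleep(100);   /* give the window manager time to switch focus */
      }
      return FALSE;     /* input would likely go to some other window */
  }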
> I won't be submitting test results next week but don't worry, I'll be back...
No problem.
- Reece
From: Reece Dunn [mailto:msclrhd@googlemail.com]
> Do you know why you need to do a SetForegroundWindow there to make the tests succeed? Why is the input being sent to a different window? Why does it vary from platform-to-platform and machine-to-machine?
It's a timing issue. It doesn't just vary from machine-to-machine but even from testrun-to-testrun (e.g. gvg-wxpprosp2 failed on 29 Jan but passed on 27 Jan). This is on VMs which get reset to a snapshot before running the tests, so the machine starts each test run identically.
Gé.
Ge van Geldorp wrote:
> From: Reece Dunn [mailto:msclrhd@googlemail.com]
>> Do you know why you need to do a SetForegroundWindow there to make the tests succeed? Why is the input being sent to a different window? Why does it vary from platform-to-platform and machine-to-machine?
>
> It's a timing issue. It doesn't just vary from machine-to-machine but even from testrun-to-testrun (e.g. gvg-wxpprosp2 failed on 29 Jan but passed on 27 Jan). This is on VMs which get reset to a snapshot before running the tests, so the machine starts each test run identically.
>
> Gé.
I think the main problem is that the thread might run in parallel with some other test code.
That is why I suggest waiting for the thread to finish at the end of the test.
I made a patch and attached it to this email. I can't reproduce the failure on my system, so I can't test whether the patch helps. It would be great if someone could test whether my patch fixes the issue.
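The shape of that change, as a minimal sketch (it assumes the test starts its helper with CreateThread; menu_input_thread and run_menu_test are illustrative names, not the attached patch itself):

  #include <windows.h>

  /* Stand-in for the test's helper thread that drives the menu. */
  static DWORD WINAPI menu_input_thread(void *arg)
  {
      HWND hwnd = arg;
      (void)hwnd;
      /* ... simulate the menu input here ... */
      return 0;
  }

  static void run_menu_test(HWND hwnd)
  {
      HANDLE thread = CreateThread(NULL, 0, menu_input_thread, hwnd, 0, NULL);
      /* ... the menu checks themselves run here ... */
      /* Block until the helper has exited so it cannot keep running
         in parallel with (and disturb) the tests that follow. */
      WaitForSingleObject(thread, INFINITE);
      CloseHandle(thread);
  }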
Best regards, Florian Köberle
Hi Florian,
From: Florian Köberle [mailto:florian@fkoeberle.de]
> I think the main problem is that the thread might run in parallel with some other test code.
> That is why I suggest waiting for the thread to finish at the end of the test.
> I made a patch and attached it to this email. I can't reproduce the failure on my system, so I can't test whether the patch helps. It would be great if someone could test whether my patch fixes the issue.
I applied your patch and ran the test about a dozen times; two runs had failures. So I'm afraid your patch is not enough.
Gé.
Hello Gé.
>> I think the main problem is that the thread might run in parallel with some other test code.
>> That is why I suggest waiting for the thread to finish at the end of the test.
>> I made a patch and attached it to this email. I can't reproduce the failure on my system, so I can't test whether the patch helps. It would be great if someone could test whether my patch fixes the issue.
>
> I applied your patch and ran the test about a dozen times; two runs had failures. So I'm afraid your patch is not enough.
>
> Gé.
How many cores does your processor have? I have a dual core, which might make the difference.
How exactly do you execute the tests? I executed "../../../tools/runtest -q menu" about 20 times without any failure.
Did my patch reduce the number of failures? I am asking because I wonder whether I should send the patch to wine-patches anyway.
Does removing the Sleep(500) increase the fail rate? I tried removing it, but it had no effect for me at all.
Best regards, Florian Köberle
Florian,
From: Florian Köberle [mailto:florian@fkoeberle.de]
> How many cores does your processor have? I have a dual core, which might make the difference.
I ran inside a VM that had 1 CPU assigned, so single core.
> How exactly do you execute the tests? I executed "../../../tools/runtest -q menu" about 20 times without any failure.
I copied user32_crosstest.exe to C:\winetest, then created a small batch file runtest.cmd:
@echo off
cd \winetest
user32_crosstest menu
pause
I put a shortcut to runtest.cmd on the desktop and set the shortcut to run minimized (during my regular test runs the console is minimized too, so I thought I'd mimic that as much as possible). Then I double-click the shortcut, wait until the test is done, restore the console window and look at the results.
> Did my patch reduce the number of failures? I am asking because I wonder whether I should send the patch to wine-patches anyway.
It's hard to say because it doesn't reproduce consistently. My feeling is that it doesn't reduce the number of failures, but I have no hard data to back that up.
> Does removing the Sleep(500) increase the fail rate? I tried removing it, but it had no effect for me at all.
No, I don't see an effect either.
I've done some more experimenting and can get the problem to reproduce more consistently (like 95% of the time instead of 25%) by launching my shortcut twice (so I have two test runs going at the same time). Then both of the runs will usually show problems.
Gé.
Hello Gé.
I ran the test 30 times the same way you did, without any failure. I used a VM with 1 CPU assigned too, running Windows XP with SP2.
If I start the test twice, so that two tests are running in parallel, then I get test failures too, but:
Do we really have the requirement that tests must be able to run in parallel?
Best regards, Florian
From: Florian Köberle [mailto:florian@fkoeberle.de]
> I ran the test 30 times the same way you did, without any failure. I used a VM with 1 CPU assigned too, running Windows XP with SP2.
> If I start the test twice, so that two tests are running in parallel, then I get test failures too, but:
> Do we really have the requirement that tests must be able to run in parallel?
Sorry about the late reply, I've been away. No, I don't think we have the requirement that tests must be able to run in parallel. The only reason I brought it up is that it seems to be a good way to make the failure consistently reproducible.
Ge.
From: Ge van Geldorp [mailto:ge@gse.nl]
> I've done some more experimenting and can get the problem to reproduce more consistently (like 95% of the time instead of 25%) by launching my shortcut twice (so I have two test runs going at the same time). Then both of the runs will usually show problems.
BTW, this also causes failures with my own fix applied (although only a single failure instead of a bunch), so my own fix wasn't correct either.
Gé.
From: Ge van Geldorp [mailto:ge@gse.nl] On Behalf Of 'Florian Köberle'
Sorry, I messed up the headers. The previous message containing:
> BTW, this also causes failures with my own fix applied (although only a single failure instead of a bunch), so my own fix wasn't correct either.
was sent by me, not by Florian.
Gé.
Reece Dunn wrote:
> Quite a few of the XP and 2003 machines have 1-3 failures, so we should soon see more Windows machines running at 0 failures. It also means we could potentially use those machines to investigate the inconsistencies in the test runs, like Dan did with the Wine runs when setting up patchwatcher.
> This could be used to see exactly what fails when you change the DPI settings, for example. Or use a different theme. Or no theme. And other things that the user could change, doing them one at a time to get a clear "this is what changes when you do X".
See the attached screenshot of what I use for my own testing purposes. This of course works well because I have only 1 system of each platform.