We switched to the new WineTestBot near the start of September. Early on it ran into a problem when it was flooded with tests due to the __WINESRC__ test work, and it failed to keep up.
The reason is that, as time went on, the time it takes to get the VMs back to a clean state ready to run the tests (the revert time) kept increasing, as can be seen on this graph:
http://fgouget.free.fr/wtb/vm1-revtimes.png
There one can see the increased activity in September caused by the surge in test patches. But the most remarkable feature is the Windows XP revert times, which increased linearly, reaching over 30 minutes. The revert times for the other VMs are much more random but still seem to increase somewhat (1).
The reason for the increase is still unknown. The revert time dropped a few times and I now suspect this corresponds to times when the VM was restored from backup.
So at the start of October I upgraded to QEMU 1.6.0 and restored all the VMs from backup, hoping this would solve the problem. The revert times did get much better and the WineTestBot now manages to keep up:
http://fgouget.free.fr/wtb/vm1-revtimes-new.png
The upgrade happened around the 2nd of October and the revert times fell at that point. But looking at more recent data it's clear that the Windows XP revert times are increasing again, while those of the other VMs remain stable (see vm1-revtimes.xls for the raw data).
That XP VM is one of the first ones I created, so it probably was created with a pretty old QEMU version. Furthermore it only had an out-of-the-box SP2 install, without any of the later updates. So I decided to recreate it from scratch using QEMU 1.6.0, and to add all the other updates while I was at it. I'm hoping that this time things will be better. That's why this VM is currently in 'maintenance' mode.
I also took down the 32-bit Windows 8 VM for maintenance. It too is missing proper updates and the plan is to upgrade it to Windows 8.1.
Another issue has been that the tests run a bit slow; a particular pain point was msi:action and msi:install, which were regularly timing out. See:
http://fgouget.free.fr/wtb/runtimes.htm
So while rebuilding the Windows XP VM I did some performance tests. That paid off: I found that switching the emulated disk cache mode from writeback to default roughly halves the msi run times (2):
http://fgouget.free.fr/wtb/runtimes2.htm
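For anyone wanting to reproduce this, the cache mode is a per-drive option on the QEMU command line (libvirt exposes it as the cache attribute of the disk <driver> element). A minimal sketch with hypothetical paths and sizes; "default" here simply means omitting the cache= option and letting QEMU pick:

```shell
# Explicit cache mode (the image path and memory size are made up):
qemu-system-i386 -m 1024 \
    -drive file=/var/lib/libvirt/images/winxp.qcow2,if=virtio,cache=writeback

# "cache=default": just leave the cache= option out entirely.
qemu-system-i386 -m 1024 \
    -drive file=/var/lib/libvirt/images/winxp.qcow2,if=virtio
# Other values compared in the tables below: cache=none, cache=writethrough
```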
msi:action and msi:install went from an average of 99 and 90 seconds down to 48 and 39 seconds and they are not timing out anymore. The remaining troublesome tests are now ddraw:ddraw{2,4,7}, shell32:shlexec, and winspool.drv:info.
I also uncovered two other potential performance issues:

* Some VMs have Windows Defender and apparently it was starting a full disk scan as soon as the VM was ready to run the tests. That's clearly not good so I disabled Windows Defender in all VMs.
* Most VMs were still using 20% of the CPU when idle. That's fine if only one VM is running but could impact performance (through increased interrupt rates) if many are running at once (like on the WineTestBot). Apparently the culprit is the USB Tablet device that's added by default. So I removed it from all VMs too (and now the mouse is even more laggy than before when I work on the VMs :-( But that does not impact the tests).
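For reference, with libvirt the tablet shows up as an <input> element in the domain XML; a sketch of how to find and remove it (the domain name "winxp" is hypothetical):

```shell
# Show the offending device in the domain XML:
virsh dumpxml winxp | grep "input type='tablet'"
# Expected element:  <input type='tablet' bus='usb'/>
# Remove that element with 'virsh edit winxp'. When launching QEMU directly,
# simply do not pass -usbdevice tablet (or -device usb-tablet).
```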
So there is still a lot of work to do on the WineTestBot, and that's not even counting fixing all the failing tests.
(1) It's almost as if the increasing Windows XP revert times were pulling up the other VM revert times. Yet there's only one revert at a given time, so I don't see how this would be possible.
(2) Setting the cache mode to writeback improved performance quite a bit with the pre-upgrade QEMU version, but that is obviously no longer true. Another interesting point is that the Virtio mode improves performance by a factor of 10 (down from 450 seconds to 45 in msi:action). Yet Windows 8 posts the fastest times so far in the basic IDE mode (~23 seconds). I couldn't get Windows 8 working with the Virtio disk yet, and I have not verified that it's really doing the same set of tests, so that remains to be confirmed.
Here is some performance data taken from the Windows XP VM while the TestBot was idle. From it cache=default looks like it's a completely separate mode!
msi:action    | msi:install | msi:msi
448 460 520   | 487 439     | 238 229    VGA, qcow2 ide disk, cache:default
44 45 42      | 47 49 46    | 24 23 21   VGA, qcow2 virtio disk, cache:default
48 45 50      | 47 48 48    | 23 23 24   VMVGA/VGA, qcow2 virtio disk, cache:default
47 46         | 48 46       | 23 22 22   VMVGA, qcow2 virtio disk, cache:default
47 47         | 49 48       | 24 23      VMVGA, raw virtio disk, cache:default
62 67 67      | 64 60 63    | 29 27 27   VMVGA, raw virtio disk, cache:none
111 92 72 113 | 101 101     | 37 37 40   VMVGA, raw virtio disk, cache:writethrough
61 64 53      | 70 59 58    | 36 28 29   VMVGA, raw virtio disk, cache:writeback
46 44         |             |            VMVGA, raw virtio disk, cache:default
71 55 68      |             |            VMVGA, raw virtio disk, cache:writeback
65 71 72 69   |             |            VMVGA, raw virtio disk, cache:none
76 118 111    |             |            VMVGA, raw virtio disk, cache:writethrough
50 48 47      |             |            VMVGA, raw virtio disk, cache:default
And the preliminary Windows 8 performance. I hope it holds up.

msi:action    | msi:install | msi:msi
23 22 22      | 16 15 16    | 13 13 15   VGA, qcow2 ide disk, cache:default
This is great, thanks!
msi:action and msi:install went from an average of 99 and 90 seconds down to 48 and 39 seconds and they are not timing out anymore. The remaining troublesome tests are now ddraw:ddraw{2,4,7}, shell32:shlexec, and winspool.drv:info.
Just summarizing: we have a hoped-for fix for the XP revert time issue; if that holds, then we have those troublesome tests, plus fixing failures, to go. Is that about the size of it?
Cheers,
Jeremy
And I meant to ask, but did not: does it make sense to throw hardware at some of the issues? We've got additional systems we could easily rig as slaves...
Cheers,
Jeremy
On Sun, 27 Oct 2013, Jeremy White wrote:
And I meant to ask, but did not: does it make sense to throw hardware at some of the issues? We've got additional systems we could easily rig as slaves...
The current VM host is far from fully utilized. The CPU is over 90% idle. The only point that worries me is the I/O but I hope disabling Windows Defender will have greatly reduced it.
To get good performance we need to: * Solve the VM revert issue.
* Tweak the self-imposed limits on active VMs and reverts. Currently we only allow up to 2 active VMs and 1 revert, that is 2 running VMs, or 1 running VM and 1 revert. Given that the box has 8 cores and each VM normally uses at most 2, I think we can go up to 4 running VMs. We can probably also allow up to 2 reverts, though that's likely more disk-limited. I think we can also reduce the post-revert sleep phase (from 30 seconds down to 10 or less).
* Defuse 'test-bombs', though these are mostly a corner case really (once the __WINESRC__ work is done it will be some time before we get such patches again). But there are other reasons to do that work anyway (bug 33065).
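The running-VM limit in the second point above is just core arithmetic; spelled out (numbers from the message, purely illustrative):

```shell
cores=8          # host cores
cores_per_vm=2   # each VM normally uses at most 2
echo $(( cores / cores_per_vm ))   # prints: 4, hence up to 4 running VMs
```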
On Mon, Oct 28, 2013 at 12:55 AM, Francois Gouget fgouget@codeweavers.com wrote:
On Sun, 27 Oct 2013, Jeremy White wrote:
And I meant to ask, but did not: does it make sense to throw hardware at some of the issues? We've got additional systems we could easily rig as slaves...
The current VM host is far from fully utilized. The CPU is over 90% idle. The only point that worries me is the I/O but I hope disabling Windows Defender will have greatly reduced it.
If we're I/O bound we can still throw hardware at the problem. Use SSDs for storage, a raid array, etc.
Francois Gouget fgouget@codeweavers.com wrote:
I also uncovered two other potential performance issues:
Some VMs have Windows Defender and apparently it was starting a full disk scan as soon as the VM was ready to run the tests. That's clearly not good so I disabled Windows Defender in all VMs.
Most VMs were still using 20% of the CPU when idle. That's fine if only one VM is running but could impact performance (through increased interrupt rates) if many are running at once (like on the WineTestBot). Apparently the culprit is the USB Tablet device that's added by default. So I removed it from all VMs too (and now the mouse is even more laggy than before when I work on the VMs :-( But that does not impact the tests).
So there is still a lot of work to do on the WineTestBot, and that's not even counting fixing all the failing tests.
Yes, there is still room for improvement in the new testbot. From time to time I send patches for testing to both the old and new testbots. While the old testbot completes the job in minutes (usually 3-5), the new one almost always completes the request several hours later. Looking at the job status it shows that the build is complete and the VMs are online and ready, but nothing happens further on.
On Mon, 28 Oct 2013, Dmitry Timoshkov wrote: [...]
Yes, there is still room for improvement in the new testbot. From time to time I send patches for testing to both the old and new testbots. While the old testbot completes the job in minutes (usually 3-5), the new one almost always completes the request several hours later.
I think that's because of two factors:

* The current __WINESRC__ jobs, which I like to call testbot-bombs. That's because these patches tend to touch multiple tests, which creates one task per test per base VM, so 8 tasks per test, plus the build one. For instance a patch fixing the 11 advapi32 tests would create 89 tasks (1 build + 11 tests * 8 VMs), and thus 89 reverts. There have been many such patches lately and that delays the tests coming after them.
* The old TestBot is not processing wine-patches emails anymore. That means it essentially has nothing to do but work on your jobs.
Defusing the TestBot bombs will be one of my next TestBot tasks after these two VMs are back online. The way it will work is to build custom WineTest binaries with just the tests we need to run, and to run that binary on each VM. So in the above case we'll only have 1 + 8 tasks.
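The task arithmetic above can be sanity-checked with a couple of lines of shell (counts taken from the advapi32 example; nothing here is actual TestBot code):

```shell
tests=11   # test files touched by the patch
vms=8      # base VMs each test runs on
before=$(( 1 + tests * vms ))   # 1 build task + 1 task per test per VM
after=$(( 1 + vms ))            # 1 build task + 1 combined WineTest run per VM
echo "$before $after"           # prints: 89 9
```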
Looking at the job status it shows that the build is complete and VMs are online and ready but nothing happens further on.
That's strange. It can happen if it's waiting on a VM that's offline or undergoing maintenance. There can also be cases where the build VM gets a bit ahead (but normally the TestBot tries to complete the jobs one at a time).
Francois Gouget fgouget@codeweavers.com wrote:
On Mon, 28 Oct 2013, Dmitry Timoshkov wrote: [...]
Yes, there is still room for improvement in the new testbot. From time to time I send patches for testing to both the old and new testbots. While the old testbot completes the job in minutes (usually 3-5), the new one almost always completes the request several hours later.
I think that's because of two factors:
- The current __WINESRC__ jobs, which I like to call testbot-bombs. That's because these patches tend to touch multiple tests, which creates one task per test per base VM, so 8 tasks per test, plus the build one. For instance a patch fixing the 11 advapi32 tests would create 89 tasks (1 build + 11 tests * 8 VMs), and thus 89 reverts. There have been many such patches lately and that delays the tests coming after them.
When I sent a patch to testbot last time there were no __WINESRC__ related patches in the queue...
- The old TestBot is not processing wine-patches emails anymore. That means it essentially has nothing to do but work on your jobs.
... and the nightly test run had finished as well. Also, there was nothing else related to wine-patches processing running either.
Looking at the job status it shows that the build is complete and VMs are online and ready but nothing happens further on.
That's strange. It can happen if it's waiting on a VM that's offline or undergoing maintenance. There can also be cases where the build VM gets a bit ahead (but normally the TestBot tries to complete the jobs one at a time).
Next time I play with the tests, what additional info could I provide to help diagnose the problem?
On Mon, 28 Oct 2013, Dmitry Timoshkov wrote: [...]
Next time I play with the tests, what additional info could I provide to help diagnose the problem?
I think the simplest would be to save the TestBot's home page and your Job's page as HTML files and send those.
The WineTestBot has a new Windows XP VM. Windows Update kept breaking, which prevented me from installing all the updates I wanted, so I'll do some more work on it at some point. Currently it has SP3 and Internet Explorer 7.
I also redid the old Windows 8 VM and upgraded it to Windows 8.1, which is exciting.
Btw, there are a lot of failures on Windows 8 and 8.1. Most of them also happen on my laptop, i.e. on real hardware (with a Windows 8 set up by Acer, not me):
http://test.winehq.org/data/cb0ef08839e515d7a9053923d27ef8899978a263/index_W...
That means there's a lot of work to get the tests in shape. If you know anything about one of the tests below your help would be greatly appreciated:
advapi32:security (2-4)
cmd.exe:batch (1)
crypt32:encode (3-6)
crypt32:sip (1)
d3d10:effect (4, yes same errors on 32/64-bit, VMs and real hardware)
kernel32:console (80)
kernel32:heap (20)
kernel32:locale (65)
kernel32:module (8)
kernel32:virtual (2)
mmdevapi:render (1 error in common)
msctf:inputprocessor (1 error in common)
mshtml:activex (4)
ntdll:path (2)
ntdll:pipe (8)
ole32:marshall (36-39)
quartz:filtergraph (crash)
rpcrt4:cstub (crash)
shell32:ebrowser (3-4)
shell32:shellpath (3-14)
shell32:shlexec (13)
shlwapi:assoc (1)
shlwapi:ordinal (25)
urlmon:misc (2)
urlmon:protocol (46-71)
urlmon:sec_mgr (17-20)
urlmon:uri (3)
urlmon:url (94)
user32:cursoricon (55)
user32:edit (4)
user32:msg (9-11)
user32:sysparams (6-7)
user32:win (11)
wininet:urlcache (17)
winmm:midi (32)
wintrust:crypt (2)
wintrust:softpub (crash)
On 1 November 2013 02:16, Francois Gouget fgouget@codeweavers.com wrote:
d3d10:effect (4, yes same errors on 32/64-bit, VMs and real hardware)
The effect.c:3956 failure looks like a bug that got fixed, so the existing result should probably be marked broken, and the todo_wine removed. I'm less sure about the other 3, but it's probably ok to just mark them as broken(). For what it's worth, most / all of the tests in d3d10/tests should be hardware independent, so it's expected that you get the same results everywhere as long as you use the same version of Windows.
On 01.11.2013 10:22, Henri Verbeet wrote:
On 1 November 2013 02:16, Francois Gouget fgouget@codeweavers.com wrote:
d3d10:effect (4, yes same errors on 32/64-bit, VMs and real hardware)
The effect.c:3956 failure looks like a bug that got fixed, so the existing result should probably be marked broken, and the todo_wine removed. I'm less sure about the other 3, but it's probably ok to just mark them as broken(). For what it's worth, most / all of the tests in d3d10/tests should be hardware independent, so it's expected that you get the same results everywhere as long as you use the same version of Windows.
Pure speculation: it looks like they return the defaults based on the DepthEnable value. What happens if you set DepthEnable to true? I failed to generate the blob... Henri, how did you generate the effect blob? I somehow only get: "header.fx(54,5): DX9 state 'MipMapLODBias' is not supported in fx_4_0; convert to 'MIPLODBIAS' or use compatibility mode to ignore". I used "fxc.exe /Tfx_4_0 /Zi /Fx temporary.fxx header.fx".
Cheers Rico
On Fri, 1 Nov 2013, Francois Gouget wrote:
The WineTestBot has a new Windows XP VM. Windows Update kept breaking, which prevented me from installing all the updates I wanted, so I'll do some more work on it at some point. Currently it has SP3 and Internet Explorer 7.
It turns out that installing SP3 did not break Windows Update. It just takes 12 hours for it to come up with the list of the 131 critical and 16 optional updates to perform. After that the updates install in under 4 hours, Windows Genuine Advantage included.
So the WineTestBot now has a fully up to date Windows XP VM with Internet Explorer 8. And like the Windows 8.1 VM it also has extra components, in this case Windows Search 4, Silverlight, optional Direct X components, msxml4 SP3, Visual C++ 2005 SP1, 2008 SP1 and 2010 SP1 runtimes.
Let me know if you find missing dlls.
Have fun!