It's been a bad month for the TestBot.
* The first issue was not with the TestBot itself but with cw1-hd6800, which provides the 'real hardware' WineTest results for the AMD HD 6800 graphics card. Its hard drive just died. Newman promptly replaced it and I restored that system (both Linux and Windows) from backups.
The good thing that came out of it is that I added the 1809 Windows 10 build to the mix and did so for the cw2-gtx560 system while I was at it. Unfortunately that's pretty much all for nothing right now since Windows 10 1809 has over 70 failures and all the WineTest reports just end up being thrown away :-(
* Then roughly a week later one of the hard drives on vm2 died. vm2 is one of the machines that run the TestBot VMs. That should not have been an issue, except the hard drive did not die outright, so the hardware RAID controller kept trying to write to it, tying itself up in the process. Eventually the Linux kernel got fed up with the controller building a backlog of writes and turned all filesystems read-only. Things don't work very well after that!
So I proceeded to restore the VMs from backups on the other hosts so the TestBot could work again. Then Newman again promptly replaced the hard drive, the controller slowly rebuilt the array, and I moved the VMs back to vm2. But the TestBot had built quite a backlog by then and it took time for it to catch up.
* One issue is that vm4 was kept pretty busy by the Linux tests: win32 + various locale tests; then wow32 and wow64. So I duplicated the wtbdebian9 VM to vm3 and split the tasks between them: win32 + locales on vm4 and wow32 + wow64 on vm3. Unfortunately the 'Submit job' page is pretty primitive and systematically creates tasks that do all 3 builds: win32, wow32 and wow64. Since neither of the wtbdebian9 VMs was keeping all three builds up to date, one Wine build was always way out of date, resulting in long build times and timeouts. So I had to go back to a single Linux VM until I can send a better 'Submit job' page.
* The next issue came when a security update on winehq.org broke Net::SSH2, thus preventing the TestBot from connecting to the VMs and sending the patches or executables to test. After some investigation I decided that Net::SSH2 is a lost cause (to be polite) and I switched the TestBot to Net::OpenSSH.
* At about the same time, commit 47242d25f5b2 moved string.c to libwine_port and somehow that broke the 64-bit reg.exe. Running reg.exe to disable the crash dialog is the first thing the TestBot does when it creates a new WinePrefix. So of course when reg.exe crashes, the crash dialog pops up and the WinePrefix creation remains stuck. This means the Linux 'Update Wine' tasks remain stuck too, for 1h15 apiece, three times, and eventually Wine remains out of date :-(
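To illustrate the reg.exe problem, here is a rough Python sketch (the TestBot itself is not written this way) of running a command under a short deadline, so that a hung step is reported quickly rather than holding a task for its full timeout. The `run_with_deadline` helper is hypothetical, and the registry key and value in `DISABLE_CRASH_DIALOG` are my assumption of what such a call looks like, not a copy of the TestBot's actual command.

```python
import subprocess
import sys

# Assumed command line: the key and value names are an illustration of
# "use reg.exe to disable the crash dialog", not taken from the TestBot.
DISABLE_CRASH_DIALOG = [
    "wine", "reg.exe", "add", r"HKCU\Software\Wine\WineDbg",
    "/v", "ShowCrashDialog", "/t", "REG_DWORD", "/d", "0", "/f",
]


def run_with_deadline(argv, seconds):
    """Run argv and return 'ok', 'failed' or 'timeout'.

    subprocess.run() kills the direct child when the timeout expires,
    so the caller gets control back instead of waiting indefinitely.
    """
    try:
        result = subprocess.run(
            argv,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    except OSError:
        # The program could not be started at all (e.g. not installed).
        return "failed"
    return "ok" if result.returncode == 0 else "failed"
```

Something like `run_with_deadline(DISABLE_CRASH_DIALOG, 60)` would then come back with "timeout" after a minute if reg.exe hangs. Note that only the direct child process is killed: with Wine, helper processes such as the crash dialog itself may survive and would need separate cleanup.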
So there we are. The TestBot is slowly catching up on its backlog (120 tasks to go) and hopefully, once the reg.exe issue is solved, the next month will see fewer crises.