https://bugs.winehq.org/show_bug.cgi?id=39425
Bug ID: 39425
Summary: Improve resilience to VM host outages
Product: Wine-Testbot
Version: unspecified
Hardware: x86
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: unknown
Assignee: wine-bugs@winehq.org
Reporter: fgouget@codeweavers.com
Distribution: ---
Currently the WineTestBot gets stuck when the connection to the libvirt server on the VM hosts is broken. This includes cases where the libvirt server is restarted, a VM host is rebooted, or there is a network outage.
The reason is that the Engine queries the status of the VMs itself in some circumstances. This creates a TCP connection which is never re-established if it breaks.
The proper fix is to banish all such queries from the Engine, not just to fix this issue but also because some of these operations can be long (a few seconds) and block the main loop of the single-threaded Engine, which can in turn cause the website to lag.
François Gouget fgouget@codeweavers.com changed:
          What|Removed                     |Added
----------------------------------------------------------------------------
    Resolution|---                         |FIXED
        Status|NEW                         |RESOLVED
--- Comment #1 from François Gouget fgouget@codeweavers.com ---
This is fixed.
All VM operations now go through LibvirtTool.pl. This avoids the blocking calls in the TestBot Engine and also spares the Engine from having to reset broken connections.
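For readers unfamiliar with the pattern, here is a minimal sketch of delegating a slow operation to a child process so that a single-threaded main loop never blocks on it. It is written in Python purely for illustration and is not the TestBot's actual code: the function names, the "vm1" key and the fake helper command are all hypothetical stand-ins for a real helper such as LibvirtTool.pl.

```python
import subprocess
import sys
import time


def start_operation(command):
    """Start a helper process and return immediately without waiting for it."""
    return subprocess.Popen(command, stdout=subprocess.PIPE, text=True)


def reap_finished(running):
    """Return (name, exit_code, output) for every helper that has exited.

    Popen.poll() never blocks, so this is safe to call from the main loop.
    """
    finished = []
    for name, proc in list(running.items()):
        if proc.poll() is not None:
            output = proc.stdout.read().strip()
            proc.stdout.close()
            finished.append((name, proc.returncode, output))
            del running[name]
    return finished


def run_demo():
    # Hypothetical stand-in for a slow helper: it sleeps, then reports.
    fake_helper = [sys.executable, "-c",
                   "import time; time.sleep(0.2); print('vm1 is idle')"]
    running = {"vm1": start_operation(fake_helper)}
    ticks = 0
    results = []
    while running:
        ticks += 1  # the main loop stays free to do other work here
        results.extend(reap_finished(running))
        time.sleep(0.01)
    return ticks, results
```

Because each helper is a fresh process, any connection it opens is also fresh, so a connection broken by an earlier outage cannot linger in the long-running main process.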
In addition to the old revert and poweroff operations, LibvirtTool.pl now has a monitor operation specifically to deal with offline VMs and to detect when they become available again. Most of the time this works just fine and VMs are put back online automatically, but there are still a few cases where this process can get stuck.
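The general idea of such a monitor can be sketched as a retry loop that opens a new connection on every attempt instead of reusing a possibly broken one. This is only an illustration in Python under assumed names: `monitor_until_online`, `connect`, `is_vm_available` and the retry parameters are hypothetical and do not reflect how LibvirtTool.pl is actually written.

```python
import time


def monitor_until_online(connect, is_vm_available, retry_delay=1.0,
                         max_attempts=None, sleep=time.sleep):
    """Poll until the VM is reachable and return the number of attempts.

    connect and is_vm_available are caller-supplied callables (hypothetical).
    Returns None if max_attempts is reached without success.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            connection = connect()  # fresh connection on every attempt
            if is_vm_available(connection):
                return attempts
        except ConnectionError:
            pass  # host still unreachable, try again later
        sleep(retry_delay)
    return None


def make_flaky_connect(failures):
    """Build a fake connect() that fails a given number of times, then works."""
    state = {"left": failures}

    def connect():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("VM host unreachable")
        return object()

    return connect
```

Passing `max_attempts` bounds the loop, which is one simple way for such a monitor to avoid retrying forever.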
The way to deal with those cases is going to be to have the TestBot Engine kill stuck processes. But that's a more general issue (see bug 44688), so I'm considering this one to be fixed.
Austin English austinenglish@gmail.com changed:
          What|Removed                     |Added
----------------------------------------------------------------------------
        Status|RESOLVED                    |CLOSED
--- Comment #2 from Austin English austinenglish@gmail.com ---
Closing.