https://bugs.winehq.org/show_bug.cgi?id=48040
Bug ID: 48040
Summary: Allow running more than one VM per host
Product: Wine-Testbot
Version: unspecified
Hardware: x86
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: unknown
Assignee: wine-bugs@winehq.org
Reporter: fgouget@codeweavers.com
Distribution: ---
The TestBot VM hosts have 4 to 8 cores. Wine's tests are essentially single-threaded, typically remain far from maxing out even a single core, and don't have heavy I/O requirements, except for the msi tests. But a number of tests (the audio tests) are timing-sensitive, in that a delay of a fraction of a second can make them fail.
It was found a long time ago that running two or more WineTest instances concurrently in separate VMs would cause extra test failures. These failures were caused either by timeouts in the msi tests (so slow I/O) or by timing issues in the audio tests. But there were even more random test failures at the time, which made the assessment tricky.
So while the TestBot can run an arbitrary number of concurrent VMs per host, its current configuration limits it to just one VM at a time.
There are a number of evolutions that make this situation less and less tenable:
* We have more and more Windows configurations to test, whether that's because of new Windows releases, or new configurations such as dual-screen, locales, etc.
* Tests on Wine involve longer rebuilds than just building the Windows test executables and would benefit greatly from more cores.
* Future hosts are more likely to get 8, 12 or 16 core CPUs (+hyperthreading) with SSDs.
With the current limit, scaling up means adding more underutilized VM hosts. So this limit should be reevaluated, and a way to lift it found if there are still issues.
* Find a way to reliably assess whether one configuration provides worse results than another despite the possible presence of random failures.
* At the time qcow2 disk I/O seemed to have a global lock issue which may have been responsible for some of the poor I/O performance and scheduling delays. -> Check whether that's still the case and if there are workarounds.
* There are two I/O models: native and threaded. -> Check if one configuration is better than the other with regard to scheduling issues and interference across VMs.
* Some gamers report that vcpu pinning can reduce latency variations. Tweaking the vcpu topology is also said to help sometimes. -> This sounds like something that would be beneficial for our audio tests, so investigate it. Should the pinning be done statically, or set by the TestBot before starting up the VM based on the set of already running VMs? In the case of a static allocation, how should the exclusion patterns be communicated to the TestBot?
https://mathiashueber.com/cpu-pinning-on-amd-ryzen/
https://www.reddit.com/r/VFIO/comments/7zcn5g/kvm_windows_10_guest_cpu_pinni...
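
For the first item (assessing whether one configuration gives worse results despite random failures), one possible approach is to run the same test many times under each configuration and compare failure counts with a one-sided Fisher exact test. The sketch below is only an illustration: the function name and the idea of counting failed runs per test unit are assumptions, not something the TestBot does today.

```python
from math import comb  # Python 3.8+


def fisher_one_sided(fail_a, runs_a, fail_b, runs_b):
    """One-sided Fisher exact test on failure counts.

    Returns the probability of seeing at least fail_b failures in
    configuration B if A and B actually share the same failure rate.
    A small value suggests B really is worse than A.
    """
    total_runs = runs_a + runs_b
    total_fail = fail_a + fail_b
    # Number of ways to distribute the observed failures over all runs.
    denom = comb(total_runs, total_fail)
    # Sum every outcome where B gets fail_b failures or more.
    # comb() returns 0 for impossible splits, so no extra bounds checks.
    favorable = 0
    for k in range(fail_b, min(runs_b, total_fail) + 1):
        favorable += comb(runs_b, k) * comb(runs_a, total_fail - k)
    return favorable / denom


if __name__ == "__main__":
    # Made-up numbers: a test fails 1 time out of 40 with one VM per host
    # and 9 times out of 40 with two concurrent VMs.
    print(fisher_one_sided(1, 40, 9, 40))
```

Since the test only looks at counts, it tolerates a baseline of random failures in both configurations. It does assume the runs are independent, which may not hold if failures cluster around host-wide events.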
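
For the pinning question, the dynamic option (the TestBot choosing host CPUs based on the set of already running VMs) could look roughly like the sketch below. It assumes the VMs are managed through libvirt, whose domain XML accepts `<vcpupin>` elements inside `<cputune>`. The function names, the `reserved` set standing in for the exclusion pattern, and the VM names are all hypothetical. Hyperthread sibling topology is ignored here.

```python
def pick_free_cpus(host_cpus, running_vms, vcpus_needed, reserved=frozenset()):
    """Return host CPU ids for a new VM, or None if too few are free.

    host_cpus:    iterable of host CPU ids
    running_vms:  dict mapping VM name -> set of host CPU ids it is pinned to
    vcpus_needed: number of vCPUs the new VM has
    reserved:     host CPU ids that must never be given to a VM
    """
    busy = set(reserved)
    for cpus in running_vms.values():
        busy |= set(cpus)
    free = sorted(set(host_cpus) - busy)
    if len(free) < vcpus_needed:
        return None
    return free[:vcpus_needed]


def cputune_xml(cpus):
    """Render a libvirt <cputune> fragment pinning vCPU i to cpus[i]."""
    lines = ["<cputune>"]
    for vcpu, cpu in enumerate(cpus):
        lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{cpu}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical 8-core host: CPU 0 is kept for the host, one VM is running.
    cpus = pick_free_cpus(range(8), {"w10": {2, 3}}, 2, reserved={0})
    if cpus is not None:
        print(cputune_xml(cpus))
```

With a static allocation the same `reserved` idea would instead live in each VM's configuration, and the TestBot would only need to know which VMs have overlapping CPU sets.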