On Sat, Apr 20, 2019 at 11:35 PM Francois Gouget [email protected] wrote:
Here are some things I've learned about PCI-passthrough recently, which would be one way (probably the best) to add "real hardware" to the TestBot.
I don't want to give anyone false hopes though: this just went from "this is a mysterious thing I need to learn about" to "I think I know how to do it but have not tried it yet".
So graphics card PCI-passthrough is now relatively well documented on the Internet, and it seems to have seen enough use to indicate it may even be reasonably usable.
There are two machines intended to run real GPU tests for Wine: cw1-hd6800 and cw2-gtx560. For now they are only used to run WineTest daily on Windows 8.1, Windows 10 1507, 1709 and 1809, and Linux. That's already quite a bunch, but it would be much better if they were integrated with the TestBot, as that would allow developers to submit their own tests. So I had a look at what it would take to convert them to VM hosts using QEmu + PCI-passthrough.
First one needs a processor with hardware support for I/O virtualisation (an IOMMU); for Intel that's VT-d. Both machines have an Intel Core i7-2600, which supports VT-d. Good.
Second, the motherboard needs to support VT-d too. Both machines have an ASRock P67 Extreme4 motherboard. Unfortunately the UEFI setup says "unsupported" next to the "VT-d" setting :-( It looks like there was initially some confusion as to whether the P67 chipset supported VT-d. From what I gathered only the Q67 does, which caused some manufacturers, ASRock among them, to initially claim support and later retract it.
From memory this ASRock board likely works okay. Back in the day we were very early adopters of VT-d/IOMMU, working closely with Intel and Nvidia. The first-generation i7 motherboards in particular were very, very buggy with VT-d, often not supporting it at all, or advertising support but with bad bugs preventing it from working. I had to test dozens of motherboards; ASRock boards generally worked at the time.
Then one needs to add the intel_iommu=on option to the kernel command line (amd_iommu=on for AMD). This should make all the PCI devices appear in /sys/kernel/iommu_groups. But that directory remains empty, which confirms that full VT-d support is missing.
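For reference, a quick check along these lines (just a sketch relying on the standard sysfs/ACPI paths; run it as root since the ACPI tables are only readable by root) shows whether the IOMMU is really active:

    #!/usr/bin/env python3
    # Check that the IOMMU is actually enabled: the firmware must expose
    # a DMAR ACPI table (Intel VT-d; AMD uses IVRS), and once
    # intel_iommu=on is set /sys/kernel/iommu_groups must be populated.
    import os

    for table in ("DMAR", "IVRS"):
        path = "/sys/firmware/acpi/tables/" + table
        print("%s: %s" % (table, "present" if os.path.exists(path) else "absent"))

    groups = sorted(os.listdir("/sys/kernel/iommu_groups"), key=int)
    print("%d IOMMU groups" % len(groups))
    for group in groups:
        devices = os.listdir("/sys/kernel/iommu_groups/%s/devices" % group)
        print("  group %s: %s" % (group, " ".join(devices)))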
Another important aspect is to have a graphics card that can be reset without rebooting. In some cases, when a VM crashes the graphics card, the only way to reset it is to reboot the host. The TestBot is likely to crash the graphics card, particularly if we do a hard power-off on the VMs like we currently do, and it would really be annoying to have to reboot the host every time the graphics card goes belly up. I don't know if the AMD HD6800 and Nvidia GTX560 are suitable, but it's quite possible they are not. All I know for now is that we should avoid AMD's R9 line of graphics cards. I still need to find a couple of suitable, reasonably low-power graphics cards: one AMD and one Nvidia.
AMD generally works fine. Nvidia, well, let's just say they are not nice and purposely work against virtualization. The driver has an if-statement blocking non-professional cards. There are workarounds, but it is a cat and mouse game. Don't bother with these. Just get a "cheap" Quadro P1000 / P2000 card and avoid the hassle. (I do have some special Nvidia virtualization-capable hardware left, but it is dated by now. I think I have some special GeForce 460 / 480 / 560 cards and a Tesla model. If needed I could share some.)
For AMD cards, the main hassle is that some of them have "PCIe reset" issues, which may prevent a VM from booting with the card. But unlike Nvidia, AMD does not try to block virtualization on consumer cards. Their Radeon Pro cards can sometimes be a little better behaved. A cheap Radeon Pro WX2100, for example, is a fine card.
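An easy way to check a specific card is that the kernel only creates the 'reset' attribute in sysfs when it knows a working reset method for the device. A quick sketch (the PCI address below is just an example):

    #!/usr/bin/env python3
    # Check whether a PCI device can be reset without rebooting the host.
    # The kernel only creates the 'reset' sysfs attribute when it has a
    # reset method for the device (FLR, secondary bus reset, ...).
    import os, sys

    # 0000:01:00.0 is just an example address; pass the real one instead.
    bdf = sys.argv[1] if len(sys.argv) > 1 else "0000:01:00.0"
    if os.path.exists("/sys/bus/pci/devices/%s/reset" % bdf):
        print("%s supports a kernel-driven reset" % bdf)
    else:
        print("%s has no reset method; a crash may require a host reboot" % bdf)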
- Then one needs to prevent the host from using the graphics card. Usually that's done by having the host use the processor's IGP and dedicating the discrete GPU to the VMs. Unfortunately the i7-2600's IGP cannot be active when a discrete card is present, so that route is denied to us. Fortunately there's quite a bit of documentation on how to shut down not just X but also the Linux virtual consoles, in order to free the GPU and hand it over to the VMs after boot. Doing so means losing KVM access to the host, which is a bit annoying in case something goes wrong. So ideally we'd make sure this does not happen in grub's "safe mode" boot option.
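The usual way to keep host drivers off the card is to bind it to vfio-pci. Typically that's done at boot via an 'options vfio-pci ids=vendor:device' line in modprobe.d, but it can also be done at runtime through driver_override. A rough sketch (the PCI address is just an example, and it assumes the vfio-pci module is already loaded):

    #!/usr/bin/env python3
    # Rough sketch (run as root): hand a PCI device over to vfio-pci at
    # runtime so no host driver grabs it. Assumes vfio-pci is loaded.
    import os

    bdf = "0000:01:00.0"  # example address of the discrete GPU
    dev = "/sys/bus/pci/devices/%s" % bdf

    # Force the next driver probe to pick vfio-pci for this device.
    with open(dev + "/driver_override", "w") as f:
        f.write("vfio-pci")
    # Detach whatever driver currently owns it, if any.
    if os.path.exists(dev + "/driver"):
        with open(dev + "/driver/unbind", "w") as f:
            f.write(bdf)
    # Ask the PCI core to (re)probe the device, binding it to vfio-pci.
    with open("/sys/bus/pci/drivers_probe", "w") as f:
        f.write(bdf)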
More for future boxes: I'm not sure whether you have physical access to these boxes or how they are maintained, but if you ever upgrade to a new spec I would go for systems with IPMI if you can. It's not that common on consumer boards; you often need a workstation board. The benefit is that you can remotely manage the systems (power on/off, serial console, VGA...) and you also get a dumb VGA device you can use. Of course a cheap card can work too, but you may like remote management and being able to just put a "farm" somewhere in a corner without keyboard and monitor.
- Although I have not done any tests yet, I'm reasonably certain that PCI-passthrough rules out live snapshots: QEmu would have no way to restore the graphics card's internal state.
Correct, hardware state is an issue. (For professional use cases Nvidia provides such a feature.) One workaround, which kind of worked at the time, was to put the guest into sleep mode, since the drivers then have to handle some state recovery themselves. It worked for games, but it's probably not worth the effort at all.
For Windows VMs that's not an issue: if we provide a powered-off snapshot the TestBot already knows how to power on the VM and wait for it to boot (as long as booting takes less than the connection timeout, which it usually does).
For Linux VMs that's more of an issue: the TestBot will power on the VM as usual. The problem is when it updates Wine: after recompiling everything it deletes the old snapshot and creates a new one from the current state of the VM, which means a live snapshot. So the TestBot will need to be modified so it knows when and how to power off the VM and take a powered-off snapshot instead.
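The powered-off snapshot itself is the easy part: once the guest has shut down, an internal qcow2 snapshot captures the disk state. A sketch of the sequence (the image path and snapshot name are made up):

    #!/usr/bin/env python3
    # Sketch of the "power off, then snapshot" sequence for Linux VMs:
    # once the guest has shut down, an internal qcow2 snapshot records
    # the disk state and can be reverted to later.
    import subprocess

    image = "/var/lib/testbot/wine-debian.qcow2"  # hypothetical image path
    name = "wine-updated"                         # hypothetical snapshot name

    # ... shut the guest down cleanly first (e.g. via TestAgent) ...
    subprocess.check_call(["qemu-img", "snapshot", "-c", name, image])  # create
    subprocess.check_call(["qemu-img", "snapshot", "-l", image])        # list
    # Reverting later: qemu-img snapshot -a wine-updated <image>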
Since the VM has full control of the graphics card, QEmu has no access to the content of the screen. That's not an issue for normal TestBot operation, just for the initial VM setup. Fortunately the graphics card is connected to a KVM switch, so the screen can be accessed that way. It does mean assigning the mouse and keyboard to the VM too. Should that prove impractical there are a bunch of other options: VNC, LookingGlass, Synergy, etc. But the less that needs to be installed in the VMs, the better.
Also, the TestBot uses QEmu to take screenshots, but here too QEmu has no access to the content of the screen. The fix is to take the screenshots with a tool running inside the VM and retrieve them with TestAgent. On Linux there are standard tools we can use; on Windows there's code floating around we can use.
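On Linux, for instance, ImageMagick's import can grab the X root window from inside the guest. A minimal sketch, assuming an X session on display :0 and an example output path:

    #!/usr/bin/env python3
    # Take a screenshot from inside a Linux guest, where QEmu cannot see
    # the screen: ImageMagick's 'import' grabs the X root window.
    import os, subprocess

    env = dict(os.environ, DISPLAY=":0")  # assumes an X session on :0
    subprocess.check_call(
        ["import", "-window", "root", "/tmp/screenshot.png"], env=env)
    # TestAgent would then fetch /tmp/screenshot.png from the guest.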
So the next steps would be:
- Maybe test on my box using the built-in IGP. But that likely won't be very conclusive beyond confirming the snapshot issues, screen access, etc.
- Find a suitable AMD or Nvidia graphics card and test that on my box. That would allow me to fully test integration with the TestBot, check for stability issues, etc.
- Then see what can be done with the existing cw1 and cw2 boxes.
Overall PCIe passthrough is definitely the way to go. We have used it in a huge capacity for years and it works very well. I would recommend using it here too.
Thanks, Roderick