https://bugs.winehq.org/show_bug.cgi?id=37104
            Bug ID: 37104
           Summary: Infinite revert loop
           Product: Wine-Testbot
           Version: unspecified
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: unknown
          Assignee: wine-bugs@winehq.org
          Reporter: fgouget@codeweavers.com
The current task scheduling algorithm can enter into an infinite revert loop while trying to prepare VMs for the next tasks. Assume the following settings:

  $MaxRevertingVMs = 2;
  $MaxRevertsWhileRunningVMs = 0;
  $MaxActiveVMs = 2;

Then the following sequence can play out:

    |                     Steps
 VM |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8
----+-----+-----+-----+-----+-----+-----+-----+-----
vm1 | rev | idl | off | off | rev | rev | rev | ...
vm2 | rev | rev | rev | idl | off | off | rev | ...
vm3 | off | off | rev | rev | rev | idl | off | ...
The issue happens in steps 2, 4 and 6. The scheduler can shut down idle VMs to replace them with VMs that are more appropriate for the upcoming tasks. This is what happens in these steps: it decides the idle VM it just prepared is not what it wants after all, and thus shuts it down and prepares another one.
The problem is that it keeps changing its mind over and over, and can never actually start a task, because there is always a reverting VM and $MaxRevertsWhileRunningVMs = 0.
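The sequence in the table can be replayed with a short Python sketch. This is not TestBot code: the `can_start_task` rule is an assumption inferred from the description above (a task may only start on an idle VM when the number of reverting VMs does not exceed $MaxRevertsWhileRunningVMs), and the function names are hypothetical.

```python
# Per-step VM states copied from the table in the report
# ("rev" = reverting, "idl" = idle, "off" = powered off).
# The trailing "..." column is omitted.
TABLE = {
    "vm1": ["rev", "idl", "off", "off", "rev", "rev", "rev"],
    "vm2": ["rev", "rev", "rev", "idl", "off", "off", "rev"],
    "vm3": ["off", "off", "rev", "rev", "rev", "idl", "off"],
}


def can_start_task(states, max_reverts_while_running):
    # Assumed rule, inferred from the report: a task needs an idle VM,
    # and no more than max_reverts_while_running VMs may be reverting.
    reverting = states.count("rev")
    return "idl" in states and reverting <= max_reverts_while_running


def replay(table, max_reverts_while_running):
    """Return the 1-based steps at which a task could be started."""
    n_steps = len(next(iter(table.values())))
    startable = []
    for i in range(n_steps):
        states = [row[i] for row in table.values()]
        if can_start_task(states, max_reverts_while_running):
            startable.append(i + 1)
    return startable
```

Under that assumed rule, `replay(TABLE, 0)` returns an empty list: every step has at least one reverting VM, so no task ever starts. With a limit of 1, `replay(TABLE, 1)` returns steps 2, 4 and 6, which are exactly the steps where an idle VM was available.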
Another prerequisite for this scenario is probably that multiple tasks have exactly the same priority, so that their relative order is undefined. But regardless, the scheduler should probably not be shutting down an idle VM that has an actual 'pending' task.
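A minimal sketch of the guard suggested in the last sentence, written in Python for illustration. The `VM` class, the status strings and `shutdown_candidates` are hypothetical names, not part of the TestBot; the only idea taken from the report is that an idle VM with a pending task should not be eligible for shutdown.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    status: str  # "off", "rev" (reverting) or "idl" (idle)


def shutdown_candidates(vms, vms_with_pending_tasks):
    """Return the names of idle VMs that may be powered off.

    An idle VM that still has a pending task is never a candidate,
    so the scheduler cannot discard a VM it has just prepared.
    """
    return [
        vm.name
        for vm in vms
        if vm.status == "idl" and vm.name not in vms_with_pending_tasks
    ]
```

For example, with vm1 idle and a task pending on vm1, the candidate list is empty and vm1 is kept until its task can start.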
--- Comment #1 from François Gouget fgouget@codeweavers.com ---
Note that setting $MaxRevertsWhileRunningVMs to 1 or more seems to be an effective workaround. Fortunately that's the current WineTestBot configuration.
Sebastian Lackner sebastian@fds-team.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sebastian@fds-team.de
François Gouget fgouget@codeweavers.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from François Gouget fgouget@codeweavers.com ---
This is fixed. It may have already been fixed before the latest scheduler rewrite since, if I am not mistaken, the TestBot has been running with $MaxRevertsWhileRunningVMs = 0 for quite some time.
In any case this definitely does not (and should not) happen with the new scheduler.
commit 66723015efdc15538bbb8c01ea91543f68893627
Author: Francois Gouget fgouget@codeweavers.com
Date:   Mon Feb 19 04:25:04 2018 +0100
testbot: Make the job scheduler more extensible.
- The new scheduler splits the work into smaller steps: assessing the current situation, starting tasks on idle VMs and building a list of needed VMs, reverting the VMs, powering off the remaining VMs, updating the activity records. Each part can be analyzed independently. It uses the $Sched structure to pass information between these functions.
- Sometimes a VM that no Task needs must be powered off in order to be able to prepare a VM that is needed. The job of picking which VM to sacrifice is now delegated to _SacrificeVM().
- The scheduler used to handle each VM host independently. As a result it was unable to prepare a VM for the 'next step' if that VM was on another VM host. The lack of a global picture also made many other extensions impossible. The new scheduler handles scheduling on all VM hosts at the same time, thus solving this issue.
- This and other improvements mean the scheduler no longer needs to loop over the jobs and tasks multiple times.
- In order to respect the per-VM-host limits the scheduler stores the host-related counters and limits in the $Sched->{hosts} table.
- The scheduler also used to build multiple lists of VMs to revert depending on whether they were needed now, for the next step, or for future jobs. The new scheduler builds a single prioritised list of VMs to revert which can be handled in one go. It also keeps more information so it can better decide which VM to prepare next.
- The scheduler can now also prepare VMs for the 'next step' earlier, thus making it more likely they will be ready in time.

Signed-off-by: Francois Gouget fgouget@codeweavers.com
Signed-off-by: Alexandre Julliard julliard@winehq.org
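To illustrate the idea of a single prioritised revert list combined with per-host limits, here is a small Python sketch. It is not taken from the commit: `pick_reverts`, its parameters, and the convention that a lower number means a more urgent VM are all assumptions made for this example, and a plain per-host counter only loosely stands in for the $Sched->{hosts} table.

```python
from collections import defaultdict


def pick_reverts(needed, host_of, max_reverting_per_host, reverting_now=None):
    """Choose which VMs to revert from one prioritised list.

    needed: list of (priority, vm_name); lower priority value = more
            urgent (an assumption of this sketch).
    host_of: dict mapping vm_name to the name of its VM host.
    max_reverting_per_host: assumed per-host cap on concurrent reverts.
    reverting_now: optional dict of host name to reverts already running.
    """
    counts = defaultdict(int, reverting_now or {})
    chosen = []
    # Walk all VM hosts' candidates together, most urgent first.
    for _priority, vm in sorted(needed):
        host = host_of[vm]
        if counts[host] < max_reverting_per_host:
            counts[host] += 1
            chosen.append(vm)
    return chosen
```

Because every host's candidates sit in the same sorted list, a VM on another host can be picked as soon as its own host has a free slot, instead of waiting for a separate per-host pass.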
Austin English austinenglish@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #3 from Austin English austinenglish@gmail.com ---
Closing.