On 8/31/19 11:47 AM, Francois Gouget wrote:
On Fri, 30 Aug 2019, Rémi Bernon wrote:
On 8/30/19 3:03 PM, Marvin wrote:
Hi,
While running your changed tests, I think I found new failures. Being a bot and all I'm not very good at pattern recognition, so I might be wrong, but could you please double-check?
Full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=56052
Your paranoid android.
=== build (build log) ===
Task errors: BotError: The VM is not powered on
I did a successful run with the same patch here: https://testbot.winehq.org/JobDetails.pl?Key=56051
Yes, here's what happened:
When it has nothing to do, the TestBot picks some VMs and starts them up in advance, in the hope that they will be needed by the next job.
Because the build VM provides the Windows binaries for testing on Windows, it's needed by almost every job. So it's given a high priority, ends up being prepared in advance, and is thus recorded by the TestBot as being in the idle state.
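As a toy illustration of that pre-warming idea (not the real Engine code, which works quite differently; the priority field and the max_idle limit are just assumptions for the sketch):

  from dataclasses import dataclass

  @dataclass
  class VM:
      name: str
      priority: int        # higher = more likely to be needed, e.g. the build VM
      status: str = "off"

  def prewarm_idle_vms(vms, max_idle=2):
      # When the TestBot has nothing to do, start the most useful powered-off
      # VMs in advance so the next job does not have to wait for a boot/revert.
      candidates = sorted((vm for vm in vms if vm.status == "off"),
                          key=lambda vm: vm.priority, reverse=True)
      for vm in candidates[:max_idle]:
          vm.status = "idle"   # in reality this kicks off a revert/boot of the VM
      return vms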
But then there was a power outage so all the VMs got powered off.
But the TestBot server is in a separate location and was not powered off, so it was not aware that the VMs had lost power. The thing is, these days the Engine never uses libvirt directly because those calls are blocking: if it tries to communicate with a dead VM host, or one where libvirt is hosed, the calls can block for a long time (up to 10 minutes), which would block the Engine for all that time. Instead it assumes the information it has in its database about the VM is accurate, and forks a process whenever it needs to perform an operation on a VM, whether that's running a task, shutting it down or reverting it.
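Very roughly, the idea looks like this (just a Python sketch of the general approach, not the actual Engine code; the libvirt URI and the "vmhost" name are made up for the example):

  import os
  import sys

  def check_vm_powered_on(vm_name):
      # The Engine itself never calls libvirt: it forks a child so that a
      # blocking call to a dead or hosed VM host can only stall the child.
      pid = os.fork()
      if pid != 0:
          return pid           # parent returns immediately and keeps scheduling
      # Child: this is the part that may block for several minutes.
      try:
          import libvirt       # assumes the libvirt Python bindings are available
          conn = libvirt.open("qemu+ssh://vmhost/system")   # made-up URI
          dom = conn.lookupByName(vm_name)
          if not dom.isActive():
              print("BotError: The VM is not powered on", file=sys.stderr)
              os._exit(1)
      except Exception as e:
          print(f"BotError: {e}", file=sys.stderr)
          os._exit(1)
      os._exit(0)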
So it just scheduled the task on the build VM as usual. But the child process could not communicate with the VM, checked its state and complained that there was an error: "The VM is not powered on".
What went wrong is that it marked the task as failed. A better recovery mechanism would have been to mark the VM as either "dirty" or "offline" and put the task back in the queued state so the TestBot tries running it again.
The risk is that if the VM is unusable for a reason that is not an external factor (as it was here), the next round is likely to produce the same result, leading the TestBot to try running the same highest-priority task again and again on the one borked VM.
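Something along these lines, for instance (again just a sketch, not a patch; the error-count threshold is an arbitrary assumption to cap the retry loop described above):

  from dataclasses import dataclass

  @dataclass
  class Task:
      status: str = "queued"

  @dataclass
  class VMState:
      status: str = "idle"
      error_count: int = 0

  MAX_VM_ERRORS = 3   # arbitrary cap so one borked VM cannot loop forever

  def handle_task_error(task, vm, message):
      if not message.startswith("BotError"):
          task.status = "failed"       # a real build/test failure: report it
          return
      # A TestBot-side error: blame the VM, not the patch.
      vm.error_count += 1
      vm.status = "offline" if vm.error_count >= MAX_VM_ERRORS else "dirty"
      task.status = "queued"           # retry later, ideally once the VM recovers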
Finally, the reason you won't see that job as failed if you look at it now is that I restarted it. The user who submitted a job that failed due to a TestBot error gets a button to restart it. A user can only restart his own jobs, and I'm not sure that would have been possible in this case since the job came from a wine-devel email (but the administrator gets to restart anyone's jobs ;-).
Anyway I'll see about tweaking the task scripts to avoid this situation in the future.
Thanks for the details!