https://bugs.winehq.org/show_bug.cgi?id=48052
Bug ID: 48052 Summary: kernel32:debugger - Wine randomly fails to get the thread context Product: Wine Version: unspecified Hardware: x86 OS: Linux Status: NEW Severity: normal Priority: P2 Component: kernel32 Assignee: wine-bugs@winehq.org Reporter: fgouget@codeweavers.com Distribution: ---
The kernel32:debugger test randomly fails with a variable number of these failures:
debugger.c:320: Test failed: GetThreadContext failed: 5 debugger.c:320: Test failed: GetThreadContext failed: 5 debugger.c:320: Test failed: GetThreadContext failed: 5 debugger.c:320: Test failed: GetThreadContext failed: 5
See https://test.winehq.org/data/tests/kernel32:debugger.html
These happen in test_debugger(), when fetch_process_context() tries to get the context of each thread. In Wine GetThreadContext() randomly fails causing a variable number of the above messages depending on how many threads are impacted.
All three Wine builds are impacted. The ERROR_ACCESS_DENIED probably comes from get_handle_obj() in server/handle.c. Given that this test was added to test a Wine exception handling race (*), maybe the race is still present?
(*) Based on the commit message for: commit 42a9b58b066ff374148cee088ac57cfba46e3b76 Author: Jacek Caban jacek@codeweavers.com AuthorDate: Mon Aug 26 19:37:20 2019 +0200
kernel32/tests: Add exception handling race test.
Signed-off-by: Jacek Caban jacek@codeweavers.com Signed-off-by: Alexandre Julliard julliard@winehq.org
The tests were later enabled on Wine by:
commit 27da4fa4498c3b6e0e46c72a9cdd13b620726e92 Author: Jacek Caban jacek@codeweavers.com AuthorDate: Fri Sep 20 16:33:53 2019 +0200
kernel32/tests: Enable debug break exception race tests on i386 Wine.
Signed-off-by: Jacek Caban jacek@codeweavers.com Signed-off-by: Alexandre Julliard julliard@winehq.org
https://bugs.winehq.org/show_bug.cgi?id=48052
François Gouget fgouget@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |source, testcase
https://bugs.winehq.org/show_bug.cgi?id=48052
Jacek Caban jacek@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |jacek@codeweavers.com
--- Comment #1 from Jacek Caban jacek@codeweavers.com --- This race is not related to the race I was working on when writing those tests. This seems related to a known limitation (race?) in Wine: https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/ntdll/thread.c#l819
Due to the way get_thread_context works, it may timeout in Wine and STATUS_ACCESS_DENIED is returned in such case. This cannot happen on Windows. I guess we could try to limit thread number based on CPU count to avoid CPU load (we currently run 20 threads, each one is running a loop).
While looking at that I noticed that all TestBot VMs run on a single core. Is there a reason for that? It does not really reflect how things are ran on real hardware those days...
https://bugs.winehq.org/show_bug.cgi?id=48052
--- Comment #2 from François Gouget fgouget@codeweavers.com --- Normally all the Home Edition Windows VMs are configured with 1 processor and 2 cores (*). For instance if I open the Resource Monitor in Windows 10 1809 I do see the usage on each of the two CPU cores. I probably kept a single core machine so we do have one like this somewhere.
And the Professional and Server editions should have 2 processors with 1 core each because licensing.
The build and Wine VMs not being encumbered with licensing issues and needing to do builds, they get as many cores as possible. That would be 4 on vm3 and vm4 and 8 (counting hyperthreading) on vm1, but presented as as many single core processors because that' the QEmu default. But I could change that to present a more interesting CPU topology.
We could give the Windows VMs more processors & cores but: * Given that the tests don't take advantage of multi-threading this does not seem useful (except for the build and Wine VMs). * The more cores the fewer concurrent VMs we can hope to one day run concurrently without interference.
(*) Initially I had just specified 2 cores and let QEmu deal with the topology but then I noticed Windows was reporting a single core. That's when I realized it was because by default QEmu presents each core as a separate processor and Windows licensing did not allow it to use more than one processor.
https://bugs.winehq.org/show_bug.cgi?id=48052
--- Comment #3 from Jacek Caban jacek@codeweavers.com --- So the actual problem is in get_thread_context(), which is triggered by this test. That's something I plan to look at, it may improve debugger stability as well, but it's not something for the code freeze.
(In reply to François Gouget from comment #2)
Normally all the Home Edition Windows VMs are configured with 1 processor and 2 cores (*). For instance if I open the Resource Monitor in Windows 10 1809 I do see the usage on each of the two CPU cores. I probably kept a single core machine so we do have one like this somewhere.
And the Professional and Server editions should have 2 processors with 1 core each because licensing.
The build and Wine VMs not being encumbered with licensing issues and needing to do builds, they get as many cores as possible. That would be 4 on vm3 and vm4 and 8 (counting hyperthreading) on vm1, but presented as as many single core processors because that' the QEmu default. But I could change that to present a more interesting CPU topology.
We could give the Windows VMs more processors & cores but:
- Given that the tests don't take advantage of multi-threading this does not
seem useful (except for the build and Wine VMs).
- The more cores the fewer concurrent VMs we can hope to one day run
concurrently without interference.
(*) Initially I had just specified 2 cores and let QEmu deal with the topology but then I noticed Windows was reporting a single core. That's when I realized it was because by default QEmu presents each core as a separate processor and Windows licensing did not allow it to use more than one processor.
I found that I was checking it in a wrong way. Still, it's weird that those timeouts happen.
https://bugs.winehq.org/show_bug.cgi?id=48052
--- Comment #4 from Jacek Caban jacek@codeweavers.com --- (In reply to Jacek Caban from comment #3)
So the actual problem is in get_thread_context(), which is triggered by this test. That's something I plan to look at, it may improve debugger stability as well, but it's not something for the code freeze.
Reimplementing get_thread_context() to use proper blocking waits helps a lot, but it ended up being a pretty invasive change: https://testbot.winehq.org/JobDetails.pl?Key=69112
https://bugs.winehq.org/show_bug.cgi?id=48052
Jacek Caban jacek@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|NEW |RESOLVED Fixed by SHA1| |c4dab9b76eb7397f6e5325ff4bd | |f792cf500e571
--- Comment #5 from Jacek Caban jacek@codeweavers.com --- This should be fixed in Wine: https://source.winehq.org/git/wine.git/commitdiff/c4dab9b76eb7397f6e5325ff4b...
https://bugs.winehq.org/show_bug.cgi?id=48052
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #6 from Alexandre Julliard julliard@winehq.org --- Closing bugs fixed in 5.7.