https://bugs.winehq.org/show_bug.cgi?id=51319
Bug ID: 51319 Summary: server: spins in a loop consuming 100% of a hardware thread until closed, makes wineserver SCHED_FIFO unsafe. Product: Wine Version: 6.9 Hardware: x86-64 OS: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: wineserver Assignee: wine-bugs@winehq.org Reporter: spambox1@koolhoven-home.net Distribution: ---
wineserver can be induced into an infinite loop of some sort and take up 100% thread/hardware CPU time. This makes setting SCHED_FIFO and SCHED_RR on wineserver unsafe, will cause immediate system lock if attempted.
https://bugs.winehq.org/show_bug.cgi?id=51319
David Koolhoven spambox1@koolhoven-home.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Regression SHA1| |737fe1f99337ab999cf67fb1acd | |f96f70020b3d4 Keywords| |regression Severity|normal |critical
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #1 from David Koolhoven spambox1@koolhoven-home.net --- Star Citizen can reliably produce this problem when loading into the Public Universe.
https://bugs.winehq.org/show_bug.cgi?id=51319
Zebediah Figura z.figura12@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Summary|server: spins in a loop |Star Citizen causes 100% |consuming 100% of a |CPU usage in wineserver |hardware thread until | |closed, makes wineserver | |SCHED_FIFO unsafe. | CC| |z.figura12@gmail.com Severity|critical |normal
--- Comment #2 from Zebediah Figura z.figura12@gmail.com --- In general, a sufficiently determined process can certainly cause high CPU usage in wineserver, yes. That's not really a bug per se.
However, if the aforementioned commit is to blame, that probably means we're polling a fd and not doing anything about the fact that it's signaled, which is a bug. Can you please attach a log with WINEDEBUG=+pid,+server,+winsock,+seh,+timestamp? Ideally with wine 6.11, which has a few tweaks to the way sockets are polled.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #3 from David Koolhoven spambox1@koolhoven-home.net --- Created attachment 70180 --> https://bugs.winehq.org/attachment.cgi?id=70180 Log slice from StarCitizen + Launcher with thread spinning at 100%
This snippet is probably 17 miliseconds worth. Original log capture was 5 seconds long and 274 megabytes in size. Entire log was captured while wineserver was spinning at 100%. (Which started around the same point while loading the public universe).
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #4 from David Koolhoven spambox1@koolhoven-home.net --- Also I cannot reproduce this with commit 520a84870f300de0422aa815198b3fd5ea6fdab1 (parent)and earlier, but I've been able to reproduce it with 737fe1f99337ab999cf67fb1acdf96f70020b3d4 and later (including 6.11 + commits).
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #5 from Zebediah Figura z.figura12@gmail.com --- Can you please attach the beginning of the log instead? I can see we're trying to poll for 0 and getting POLLHUP, but I don't know why we're in that state.
https://bugs.winehq.org/show_bug.cgi?id=51319
David Koolhoven spambox1@koolhoven-home.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #70180|0 |1 is obsolete| |
--- Comment #6 from David Koolhoven spambox1@koolhoven-home.net --- Created attachment 70188 --> https://bugs.winehq.org/attachment.cgi?id=70188 Possible capture of beginning of issue.
I think this log includes the last of the calls before the issue takes over the log file.
I interleaved the output of a filtered top command into this, and a long while after this wineserver has a 100% entry. The whole log was 5 GB, and even though the time from the beginning of the event to the end of this log is only 3-5% of the time the log was being recorded, including 7 minutes of loading data from drive, it accounts for half the log size.
I tried to get the largest chunk of the log that seemed relevant that would fit into the 10 MB upload limit.
Thank you!
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #7 from David Koolhoven spambox1@koolhoven-home.net --- I also saw some of the newer commits which might effect this, I can build off git to test if that might be helpful.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #8 from Zebediah Figura z.figura12@gmail.com --- Unfortunately that's still not early enough, sorry.
If you're running into size limits, note that you can compress the file before uploading; log files compress very well.
https://bugs.winehq.org/show_bug.cgi?id=51319
David Koolhoven spambox1@koolhoven-home.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #70188|0 |1 is obsolete| |
--- Comment #9 from David Koolhoven spambox1@koolhoven-home.net --- Created attachment 70192 --> https://bugs.winehq.org/attachment.cgi?id=70192 End of Log with CPU usage included.
Complete log was captured, this time I left the CPU usage lines in. They can be trimmed out with something like `sed '/wineserver CPU usage from top/d'`. I have a log of the full run and can snip out earlier parts if requested this time. The CPU usage lines are prefixed with a timestamp and the full log ranges from 1624548363.477506 to 1624548673.525148. This part of the log uploaded accounts for 13 of 310 seconds captured.
This log contains the first entry with 100% usage. Reading the CPU usage for the beginning of the issue in this log is tricky because wineserver seems to compete with `cat` for core use.
Thanks again!
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #10 from David Koolhoven spambox1@koolhoven-home.net --- Created attachment 70193 --> https://bugs.winehq.org/attachment.cgi?id=70193 Capture may include start of issue.
Same as the previous, this one ends at the line just before the last one starts. Accounts for T = -13 to -37 (24 seconds length).
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #11 from David Koolhoven spambox1@koolhoven-home.net --- I almost forgot to mention, this run was captured with daily build from git: ad03df1222c2bb22e991641dcc0d9e4ed684158b
https://bugs.winehq.org/show_bug.cgi?id=51319
Sveinar Søpler cybermax@dexter.no changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |cybermax@dexter.no
--- Comment #12 from Sveinar Søpler cybermax@dexter.no --- Created attachment 70196 --> https://bugs.winehq.org/attachment.cgi?id=70196 World of warcraft retail
Seems to be the same problem for World of Warcraft (retail) aswell.
Reverting 737fe1f99337ab999cf67fb1acdf96f70020b3d4 does fix it, but seems to create problems with the Battle.net app (browsing++ does not work correctly).
The log is with WineHQ (ubuntu repo) wine-staging-6.11
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #13 from Sveinar Søpler cybermax@dexter.no --- Created attachment 70204 --> https://bugs.winehq.org/attachment.cgi?id=70204 Workaround patch
Found this patch on another site, and it does seem to fix the problems for this issue, atleast for wine-staging-6.11, and World of Warcraft + Battle.net.
PS. Do not revert 737fe1f99337ab999cf67fb1acdf96f70020b3d4
Just incase someone want to do some testing.
https://bugs.winehq.org/show_bug.cgi?id=51319
David Koolhoven spambox1@koolhoven-home.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |patch
https://bugs.winehq.org/show_bug.cgi?id=51319
David Koolhoven spambox1@koolhoven-home.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Summary|Star Citizen causes 100% |Star Citizen, WoW causes |CPU usage in wineserver |100% CPU usage in | |wineserver
https://bugs.winehq.org/show_bug.cgi?id=51319
Chiitoo escomk3@hotmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |escomk3@hotmail.com
--- Comment #14 from Chiitoo escomk3@hotmail.com --- It looks like this might affect Steam as well, where 'wineserver' will start consuming all of one thread usually after simply idling for about 10 minutes in online mode.
Reverting the mentioned commit seems to fix it there too.
I also saw something similar with Battle.net some time ago, but haven't tested that more.
(I didn't have the chance to do extensive testing yet, so this could all just be a big coincidence.)
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #15 from David Koolhoven spambox1@koolhoven-home.net --- I'm just confirming this is still happening with daily build from 7f1623bc626d3ca2411c1a3088512d8ef461252b
I hope to test and review that patch soon.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #16 from David Koolhoven spambox1@koolhoven-home.net --- Comment on attachment 70204 --> https://bugs.winehq.org/attachment.cgi?id=70204 Workaround patch
The closest to a source I can find on this patch is Etienne Juvigny claiming it was created by Torge Matthies in this commit off-site: https://github.com/Frogging-Family/wine-tkg-git/commit/112db36f4f84dc124cc7739c9523992343f91882#diff-3fafde74252378b4a03e7feef6bd76f5e3df02728b549082b3511da388594a44
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #17 from David Koolhoven spambox1@koolhoven-home.net --- Issue can still be reproduced with daily commit: 52ba1b498a9694daf804c9aea99c788bb4e753a3
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #18 from David Koolhoven spambox1@koolhoven-home.net --- Comment on attachment 70204 --> https://bugs.winehq.org/attachment.cgi?id=70204 Workaround patch
Just tested this in it's two parts one at a time, and it only works when both are present. If shutdown sockets with no events are pulled without the event on non-listening sockets being set to not return, connections will not be re-used causing stalls. The whole patch works great though.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #19 from David Koolhoven spambox1@koolhoven-home.net --- I sent a version of the patch into the patch list. I'll update if it's accepted or rejected.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #20 from Sveinar Søpler cybermax@dexter.no --- (In reply to David Koolhoven from comment #19)
I sent a version of the patch into the patch list. I'll update if it's accepted or rejected.
It seems as wine-6.12 (and also wine-staging-6.12.1) completely broke Battle.net connectivity, so it is kind of hard to test if any new patches that fixes this particular wineserver problem is being implemented.
Have you tested wine-staging-6.12.1 and see if you can log on? (Kinda seems as it is waiting on Agent.exe update, but not easy to see). Log from Agent.exe
Wine-6.12:
[I 2021-07-04 12:43:40.0322] Handle Event: "update agent event" [I 2021-07-04 12:43:40.0322] Checking Network Connection status to http://us.patch.battle.net:1119/agent [I 2021-07-04 12:43:40.0852] Network Connection Response to http://us.patch.battle.net:1119/agent - CURL error: 0, Status Code: 404
<= Wine-6.11 [I 2021-07-03 20:03:25.0332] Handle Event: "update agent event" [I 2021-07-03 20:03:25.0335] Checking Network Connection status to http://eu.patch.battle.net:1119/agent [I 2021-07-03 20:03:25.0335] Request GET /agent
If Agent.exe cant connect = no dice afaik.
https://bugs.winehq.org/show_bug.cgi?id=51319
Roman Pišl rpisl@seznam.cz changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |rpisl@seznam.cz
https://bugs.winehq.org/show_bug.cgi?id=51319
Béla Gyebrószki gyebro69@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |gyebro69@gmail.com
https://bugs.winehq.org/show_bug.cgi?id=51319
Zebediah Figura z.figura12@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Fixed by SHA1| |361435f6095f8c759979600b06e | |d28785e7b3aec Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED
--- Comment #21 from Zebediah Figura z.figura12@gmail.com --- Should be fixed by https://source.winehq.org/git/wine.git/commitdiff/361435f6095f8c759979600b06ed28785e7b3aec.
https://bugs.winehq.org/show_bug.cgi?id=51319
--- Comment #22 from Chiitoo escomk3@hotmail.com --- For what it is worth, I still see the issue with Steam at 361435f6, however with the one after that, 9bc5bc7c [1], things look better there.
1. https://source.winehq.org/git/wine.git/commit/9bc5bc7c6628a69cef6e64facb8eb7...
https://bugs.winehq.org/show_bug.cgi?id=51319
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #23 from Alexandre Julliard julliard@winehq.org --- Closing bugs fixed in 6.14.