[Bug 51442] New: A networking application misbehaves and causes 100% CPU usage in wineserver
https://bugs.winehq.org/show_bug.cgi?id=51442

            Bug ID: 51442
           Summary: A networking application misbehaves and causes 100% CPU usage in wineserver
           Product: Wine
           Version: 6.12
          Hardware: x86-64
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: winsock
          Assignee: wine-bugs(a)winehq.org
          Reporter: rpisl(a)seznam.cz
      Distribution: ---

Created attachment 70293
  --> https://bugs.winehq.org/attachment.cgi?id=70293
WINEDEBUG=+winsock trace BAD

A networking application misbehaves and causes 100% CPU usage in wineserver since commit 414b31bc0bbbfe005e90a1946a649082dc303c55, so it is a regression. Reverting commits 414b31bc0bbbfe005e90a1946a649082dc303c55 and 1ccab719ee6e87b7399876d4d5b30eb889c49e32 makes it work again.

The setup is complex, but I'll try to provide useful information and testing, and eventually extract a minimal reproducible setup.

--
Do not reply to this email, post in Bugzilla using the above URL to reply.
You are receiving this mail because:
You are watching all bug changes.
https://bugs.winehq.org/show_bug.cgi?id=51442 Roman Pišl <rpisl(a)seznam.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
       Distribution|---                         |Debian
    Regression SHA1|                            |414b31bc0bbbfe005e90a1946a649082dc303c55
                 CC|                            |z.figura12(a)gmail.com
           Keywords|                            |regression
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #1 from Roman Pišl <rpisl(a)seznam.cz> --- Created attachment 70294 --> https://bugs.winehq.org/attachment.cgi?id=70294 WINEDEBUG=+winsock trace GOOD Same run with the two commits reverted.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #2 from Zebediah Figura <z.figura12(a)gmail.com> --- Should hopefully be fixed by <https://source.winehq.org/git/wine.git/commitdiff/361435f6095f8c759979600b06ed28785e7b3aec> or <https://source.winehq.org/git/wine.git/commitdiff/9bc5bc7c6628a69cef6e64facb8eb7e3cf2e269b>; please retest with current git or with Wine 6.14 when it is released.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #3 from Roman Pišl <rpisl(a)seznam.cz> --- Unfortunately that is not the case. I tried git head with a complete rebuild, but the behavior is still the same. The two patches still need to be reverted to fix the regression, to get the same behavior as on Windows, and to keep wineserver from taking 100% CPU. I'm aware that the information from my side is insufficient. Please give me some time to prepare a publishable example to reproduce this.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #4 from Zebediah Figura <z.figura12(a)gmail.com> --- A +winsock,+server trace would also work, actually.
https://bugs.winehq.org/show_bug.cgi?id=51442 ihxy <ayafcc(a)163.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ayafcc(a)163.com

--- Comment #5 from ihxy <ayafcc(a)163.com> --- I use wework in Wine 6.14 and have the same problem.
https://bugs.winehq.org/show_bug.cgi?id=51442 Roman Pišl <rpisl(a)seznam.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://download.rexcontrols.cz/files/test/wine-bug51442-reproduce.zip

--- Comment #6 from Roman Pišl <rpisl(a)seznam.cz> --- This is a race condition that is hard to debug. At least I have prepared a test that reproduces the problem. Unfortunately it is rather complex so far.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #7 from Roman Pišl <rpisl(a)seznam.cz> --- Also, I occasionally encounter the following error with cmake+clang compilation under Wine: sendmsg: An operation was attempted on something that is not a socket. Maybe it is related? Is WSAENOTSOCK returned erroneously somewhere?
https://bugs.winehq.org/show_bug.cgi?id=51442 Roman Pišl <rpisl(a)seznam.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #8 from Roman Pišl <rpisl(a)seznam.cz> --- This may as well be a hidden bug in the application; don't spend precious time on it. I will eventually prepare a simple test case if it turns out to be a Wine bug.
https://bugs.winehq.org/show_bug.cgi?id=51442 Roman Pišl <rpisl(a)seznam.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |UNCONFIRMED
            Summary|A networking application    |Socket connection is not
                   |misbehaves and causes 100%  |established properly
                   |CPU usage in wineserver     |

--- Comment #9 from Roman Pišl <rpisl(a)seznam.cz> --- I am reopening this as I have probably found a valid Wine trace log. The bug was previously called "A networking application misbehaves and causes 100% CPU usage in wineserver", but the CPU usage was fixed by later commits and no longer occurs with a fresh wineprefix.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #10 from Roman Pišl <rpisl(a)seznam.cz> --- Created attachment 71313 --> https://bugs.winehq.org/attachment.cgi?id=71313 WINEDEBUG=+winsock trace with comments This is the output of WINEDEBUG=+winsock, annotated with my attempt to work out what is going on; hopefully it is valid.
https://bugs.winehq.org/show_bug.cgi?id=51442 Gabriel Ivăncescu <gabrielopcode(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabrielopcode(a)gmail.com

--- Comment #11 from Gabriel Ivăncescu <gabrielopcode(a)gmail.com> --- This is a genuine regression. It happens pretty consistently in Firefox or Pale Moon (I tested 32-bit versions only). To reproduce, just go to mail.google.com, possibly log in to your Gmail account, and browse some mail and the categories on the left. At some point, it will stop loading as if you were offline. Once this bug is triggered, no other connections will work: you can try to go to any other website from the URL bar, and it will not connect but hang indefinitely. Sometimes, it starts hanging as soon as mail.google.com is loaded, and then of course no other connection works anymore, but that happens mostly on Firefox rather than Pale Moon. By "hang" I mean the connections hang, not the rest of the browser. I've bisected it to this exact commit, but unfortunately my skills in this area are lacking so I can't really figure out what's wrong.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #12 from Roman Pišl <rpisl(a)seznam.cz> --- (In reply to Gabriel Ivăncescu from comment #11)
> This is a genuine regression. It happens pretty consistently in Firefox or Pale Moon (I tested 32-bit versions only).
Good to hear that there is another way to reproduce.
> Sometimes, it starts hanging as soon as mail.google.com is loaded, and then of course no other connection works anymore, but that happens mostly on Firefox rather than Pale Moon. By "hang" I mean the connections hang, not the rest of the browser.
I observe the same symptoms. Also sometimes (but not always) all new socket connections are broken until wineserver is restarted.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #13 from Zebediah Figura <z.figura12(a)gmail.com> --- The minimal test application would probably be easier to debug, but it seems that the link is dead.
https://bugs.winehq.org/show_bug.cgi?id=51442 Roman Pišl <rpisl(a)seznam.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|https://download.rexcontrols.cz/files/test/wine-bug51442-reproduce.zip |

--- Comment #14 from Roman Pišl <rpisl(a)seznam.cz> --- Hi Zebediah, I tried a recent Firefox as mentioned in comment 11 and it seems to be the same case. If that doesn't help, I'll prepare a simpler test case in the following days that reproduces what I posted in comment 10.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #15 from Roman Pišl <rpisl(a)seznam.cz> --- It seems that bug 51648 is a duplicate of this bug. Loading a YouTube page works before commit 414b31bc0bbbfe005e90a1946a649082dc303c55 and still fails with git master (playing a video doesn't, but that's a different bug).
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #16 from Zebediah Figura <z.figura12(a)gmail.com> --- FWIW, I'm not perfectly convinced that the test application and Firefox suffer from the same bug either. They might, but I'll believe it when I see it. There are a lot of ways for a socket connection to go wrong.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #17 from Roman Pišl <rpisl(a)seznam.cz> --- This bug can be triggered quite reliably without hitting other bugs with: firefox.exe -private -devtools https://www.phoronix.com It is very likely that at least one connection remains stalled and no other content can be downloaded from then on. I'm also working on a simple and reliable example to reproduce the bug, but without success so far.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #18 from Zebediah Figura <z.figura12(a)gmail.com> --- The Firefox hang appears to be due to stack corruption, from passing a fd_set that is larger than FD_SETSIZE. Since I don't see any such symptoms in the log from comment 10, I'm going to assume that it's a different bug, and I've filed bug 52302 accordingly.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #19 from Roman Pišl <rpisl(a)seznam.cz> --- Hi Zebediah, you were right - the issue is not the same. But thanks to your recent fixes, it is starting to become clearer! The problem occurs when connecting a socket in non-blocking mode. The connect fails multiple times with WSAECONNREFUSED (10061), but why, if it is on localhost? Then, if it succeeds, the socket is switched back to blocking mode by the app but is never marked as ready for writing. I'll compare the behavior with Windows and prepare some tests after the weekend.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #20 from Roman Pišl <rpisl(a)seznam.cz> --- What the occasionally failing program does:

1. Spawns another process (which takes some time to initialize and finally listens on a local TCP socket)
2. Makes the socket non-blocking - ioctlsocket(fd, FIONBIO, 1)
3. Connects to #1 and tests the result: ok or error -> done, if WSAEWOULDBLOCK:
4. select(.., NULL, wfdset, timeout)
5. getsockopt(fd, SOL_SOCKET, SO_ERROR, ..)
6. SO_ERROR!=0 && SO_ERROR!=WSAEWOULDBLOCK -> error
7. fd ready for write -> #8 else -> #4
8. Makes the socket blocking - ioctlsocket(fd, FIONBIO, 0)
9. select(.., NULL, wfdset, ..)
10. send()

This sometimes runs to #9 but loops there forever; the socket is never marked as ready for write again. Performing #5+#6 also before #4 fixes this. Either it really fixes the problem, or it changes the timing and hides the real problem. Since a fix exists and the bug is hard to reproduce, it is not critical. I will test again once in a while.
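The sequence in comment 20, including the workaround of reading SO_ERROR before each select(), might look roughly like the sketch below. It uses Python's portable socket and select modules rather than the Winsock calls the application makes, so it only approximates the real code. The names connect_nonblocking and check_so_error and the timeout and poll_interval parameters are assumptions for illustration; spawning the server (#1) and the final select()/send() (#9, #10) are left to the caller.

```python
import errno
import os
import select
import socket
import time

# Results that mean "connection attempt still in progress", not failure.
IN_PROGRESS = (errno.EINPROGRESS, errno.EWOULDBLOCK)


def check_so_error(sock):
    """Steps 5 and 6: read SO_ERROR and raise on a real failure."""
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0 and err not in IN_PROGRESS:
        raise OSError(err, os.strerror(err))


def connect_nonblocking(host, port, timeout=5.0, poll_interval=0.1):
    """Hypothetical helper: non-blocking connect, returns a blocking socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)                  # step 2
        err = sock.connect_ex((host, port))      # step 3
        if err != 0 and err not in IN_PROGRESS:
            raise OSError(err, os.strerror(err))
        deadline = time.monotonic() + timeout
        while err != 0:
            check_so_error(sock)                 # workaround: #5+#6 before #4
            if time.monotonic() >= deadline:
                raise TimeoutError("connect timed out")
            # step 4: wait until the socket is reported writable
            _, writable, _ = select.select([], [sock], [], poll_interval)
            check_so_error(sock)                 # steps 5 and 6
            if writable:                         # step 7
                break
        sock.setblocking(True)                   # step 8
        return sock
    except BaseException:
        sock.close()
        raise
```

A caller would then continue with steps #9 and #10 on the returned socket, for example by calling its send() method.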
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #21 from Zebediah Figura <z.figura12(a)gmail.com> --- Can you by any chance reproduce this with +winsock,+server?
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #22 from Roman Pišl <rpisl(a)seznam.cz> --- (In reply to Zebediah Figura from comment #21)
> Can you by any chance reproduce this with +winsock,+server?
Ok, I'll perform future experiments with these options and see what they catch. Unfortunately +server is a big change to the timing (that is a problem for Wine; the application itself is single-threaded).
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #23 from Roman Pišl <rpisl(a)seznam.cz> --- Created attachment 71520 --> https://bugs.winehq.org/attachment.cgi?id=71520 WINEDEBUG=+winsock,+server Trace log with +winsock,+server
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #24 from Zebediah Figura <z.figura12(a)gmail.com> --- I suspect, but haven't confirmed, that we're racing between sock_error() and poll(). It looks like the connection fails, signaling AFD_POLL_WRITE (is this correct?) while also returning STATUS_CONNECTION_REFUSED. select() throws that status away because it doesn't care, though, and after the request completes the program checks for SO_ERROR, but the error was already swallowed. I'm not sure it's correct that we're signaling AFD_POLL_WRITE, and even if it is, we need to check whether select() should be signaling the writefd and returning success here.
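The "error was already swallowed" step can be illustrated outside Wine: a socket's pending error is a read-once value, so whoever consumes it first leaves nothing for a later getsockopt(SO_ERROR). The sketch below is only an analogy that exercises the host's own socket layer through Python (Linux behavior assumed). It does not reproduce Wine's sock_error() or its select() implementation, and refused_connect_errors is a hypothetical helper.

```python
import errno
import select
import socket


def refused_connect_errors(port, timeout=2.0):
    """Connect to a closed local port.

    Returns (connect result, first SO_ERROR read, second SO_ERROR read).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        result = sock.connect_ex(("127.0.0.1", port))
        if result == errno.EINPROGRESS:
            # The failure is reported asynchronously; wait until the
            # socket is flagged as writable or in error.
            select.select([], [sock], [sock], timeout)
        first = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        second = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return result, first, second
    finally:
        sock.close()
```

Under these assumptions the refusal is expected to surface only once, either as the direct result of the connect call or in the first SO_ERROR read, while the second read returns 0. A caller that discards the first report therefore has no later way to learn that the connection failed.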
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #25 from Zebediah Figura <z.figura12(a)gmail.com> --- Created attachment 71537 --> https://bugs.winehq.org/attachment.cgi?id=71537 avoid reporting POLLOUT on connection failure Does the attached patch help? It's not a complete solution, but it should hopefully fix the issue for now.
https://bugs.winehq.org/show_bug.cgi?id=51442 --- Comment #26 from Roman Pišl <rpisl(a)seznam.cz> ---
> Does the attached patch help? It's not a complete solution, but it should hopefully fix the issue for now.
Yes, it does help!
https://bugs.winehq.org/show_bug.cgi?id=51442 Zebediah Figura <z.figura12(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Fixed by SHA1|                            |51e5995d47b7de9a2d0d6a40f7eb3e3c11b83cf2
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #27 from Zebediah Figura <z.figura12(a)gmail.com> --- Fixed by <https://source.winehq.org/git/wine.git/commitdiff/51e5995d47b7de9a2d0d6a40f7eb3e3c11b83cf2>.
https://bugs.winehq.org/show_bug.cgi?id=51442 Alexandre Julliard <julliard(a)winehq.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #28 from Alexandre Julliard <julliard(a)winehq.org> --- Closing bugs fixed in 7.0-rc6.