https://bugs.winehq.org/show_bug.cgi?id=50955
Bug ID: 50955
Summary: .netCore app can't bind to port shortly after another .netCore program binding to the same port was terminated
Product: Wine
Version: 6.5
Hardware: x86-64
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: -unknown
Assignee: wine-bugs@winehq.org
Reporter: besentv@gmail.com
Distribution: ---
Created attachment 69784 --> https://bugs.winehq.org/attachment.cgi?id=69784 BrokenClient
I attached code for three programs that together form an extremely scaled-down reproduction of a bug I was trying to track down in a proprietary program:
BrokenServer and BrokenClient are both C# programs targeting .NET Core 3.1 (x86).
BrokenServer creates a socket, tries to bind it to port 41811 and ends up accepting all incoming connections on this port inside an infinite loop.
BrokenClient tries to connect to the socket opened by BrokenServer and ends up in an infinite loop (this seems important!?).
TestCode is to test what is actually broken in Wine: It runs a BrokenServer and 3 BrokenClients using CreateProcessA() and waits for getchar(). After that it stops all 4 processes using TerminateProcess() and immediately restarts the BrokenServer using CreateProcessA(). The server tries to bind to port 41811 but unlike on Windows it (almost) always fails to do so, showing a MessageBox with the information that the port is already in use.
As mentioned before, the infinite loop in BrokenClient seems to make a difference because I never encountered this issue without it. I wasn't able to recreate this problem with native code. Everything was tested in a clean prefix with .netCore x86 desktop ("windowsdesktop-runtime-3.1.10-win-x86") installed.
--- Comment #1 from Bernhard besentv@gmail.com ---
Created attachment 69785 --> https://bugs.winehq.org/attachment.cgi?id=69785 BrokenServer
--- Comment #2 from Bernhard besentv@gmail.com ---
Created attachment 69786 --> https://bugs.winehq.org/attachment.cgi?id=69786 TestCode / TestProgram
--- Comment #3 from Bernhard besentv@gmail.com ---
After further investigation I found that TerminateProcess() ends up calling sock_destroy() inside wineserver. This function closes the socket normally, which puts it into the TIME_WAIT state and makes binding a new socket to the same port impossible for some time. On Windows, however, TerminateProcess() closes the connection with a RST, ACK packet. A first solution I found is to set the socket's linger time to 0 seconds before the call to shutdown() inside sock_destroy(), though I'm not sure it's a good one.
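To illustrate the linger idea outside of wineserver, here is a minimal Python sketch using plain loopback sockets. It is not the wineserver change itself. The function name rebind_after_abort is made up for this example, and the struct.pack("ii", ...) layout assumes the Linux/POSIX struct linger (two ints). With l_onoff=1 and l_linger=0, close() should abort the connection with an RST, so the peer sees a reset and the port can be bound again right away.

```python
import socket
import struct


def rebind_after_abort() -> bool:
    """Abort an accepted connection with SO_LINGER 0, then rebind the port.

    Hypothetical helper for illustration; returns True if the peer saw a
    reset and the same port could be bound again immediately.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    accepted, _ = listener.accept()

    # l_onoff=1, l_linger=0: close() drops the connection and sends RST
    # instead of starting the FIN handshake, so no TIME_WAIT entry remains.
    # Assumption: struct linger is two C ints (true on Linux).
    accepted.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    accepted.close()
    listener.close()

    try:
        client.recv(1)                       # peer side of the aborted connection
        saw_reset = False
    except ConnectionResetError:
        saw_reset = True
    finally:
        client.close()

    # Try to bind the same port again, deliberately without SO_REUSEADDR.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(("127.0.0.1", port))
        rebound = True
    except OSError:
        rebound = False
    finally:
        second.close()
    return saw_reset and rebound
```

Calling rebind_after_abort() on a Linux host is expected to return True, whereas the same sequence with a normal close() tends to leave the port unavailable for a while.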
florian.will@gmail.com changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |florian.will@gmail.com
--- Comment #4 from florian.will@gmail.com ---
Created attachment 71900 --> https://bugs.winehq.org/attachment.cgi?id=71900 Wine test suite patch to (hopefully) reproduce this issue
ZusiDisplay is affected by this when a train reverses at a station and embedded/integrated displays are re-started by Zusi.
The attached Wine test suite patch reproduces this in different ways (but it fails on the Wine Testbot on Debian, probably due to a race condition where the child process takes longer than 100 ms to open the listening socket). After a rather complex first attempt that actually spawns child processes, to stay close to what ZusiDisplay and the BrokenClient/BrokenServer reproducer do, I found an easier way:
The same / a very similar issue occurs with just one single-threaded test case (called "test_port_reuse()") that sets up a listening socket, connects to that socket, then closes the connection and the listening socket, then does it again using the same listening port. On Windows, the test passes. Using wine on Linux, the second attempt fails because the port is still "in use" (actually in TIME_WAIT state).
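The single-threaded sequence described above can be sketched in Python as follows. This is an illustration, not the attached test_port_reuse() patch. The helper names are invented, and closing the accepted socket before the client is an assumption made here so that the TIME_WAIT entry ends up on the listening port's side (the side that closes first is the one that enters TIME_WAIT).

```python
import errno
import socket


def listen_connect_close(port: int):
    """One round: listen on `port`, accept one loopback connection, close all.

    Hypothetical helper. Returns (port, None) on success, or (port, errno)
    if bind() failed.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", port))
    except OSError as exc:
        listener.close()
        return port, exc.errno
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    accepted, _ = listener.accept()

    # Assumption: the server side closes first, so the connection that used
    # the listening port is the one left behind in TIME_WAIT.
    accepted.close()
    client.close()
    listener.close()
    return port, None


def port_reuse_results():
    """Run two rounds on the same port and return both bind results."""
    port, first = listen_connect_close(0)     # 0: the OS picks a free port
    _, second = listen_connect_close(port)    # same port, immediately afterwards
    return first, second
```

On Linux the second result is typically errno.EADDRINUSE, which corresponds to the failure seen under Wine. According to the report above, the equivalent sequence passes on Windows.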
It seems like TIME_WAIT on Windows doesn't affect listening at all, so applications are free to bind to and listen on a port just moments after the previous listening socket was closed. It looks like the Windows kernel just takes extra care to make TCP work despite TIME_WAIT, compared to Linux. As I understand it, TIME_WAIT should prevent new connections using the same <local_ip:local_port, remote_ip:remote_port> combination, so just listening on a port is not an issue as no connection has been made at that point. The remote end will probably use a different port next time (or the connection can be refused otherwise, or even accepted using a larger-than-before initial TCP sequence number).
I'm not sure if there is a good way to emulate this behavior in Wine. One way that comes to mind is to set the SO_REUSEADDR socket option for AF_INET sockets before the call to bind() in wineserver. However, this will break other test cases because they expect bind()ing to the same address twice to fail. So the current Wine behavior is too restrictive compared to Windows, but always using SO_REUSEADDR is too permissive, and I'm not sure if there is any sensible middle ground. Maybe being too permissive is acceptable in this case? I still need to test if that actually fixes the ZusiDisplay issue (via Proton) though.
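Both sides of that trade-off can be seen with plain sockets. The Python sketch below is only an illustration of SO_REUSEADDR on the host OS, not of the wineserver patch, and the function names are hypothetical. The first function sets SO_REUSEADDR on every listening socket and re-listens on a port that was just used. The second binds two sockets to the same address, which is the case the existing tests expect to fail.

```python
import socket


def listen_round(port: int) -> int:
    """Listen on `port` with SO_REUSEADDR, accept one connection, close all."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen(1)
        port = listener.getsockname()[1]
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        accepted, _ = listener.accept()
        accepted.close()      # server side closes first (TIME_WAIT on this port)
        client.close()
    return port


def relisten_with_reuseaddr() -> bool:
    """Return True if the port can be listened on again right away."""
    port = listen_round(0)                    # 0: the OS picks a free port
    try:
        listen_round(port)
    except OSError:
        return False
    return True


def double_bind_allowed() -> bool:
    """Return True if two SO_REUSEADDR sockets may bind the same address."""
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        for s in (a, b):
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        a.bind(("127.0.0.1", 0))
        try:
            b.bind(a.getsockname())           # same address and port as `a`
        except OSError:
            return False
        return True
    finally:
        a.close()
        b.close()
```

relisten_with_reuseaddr() is expected to return True. On Linux, double_bind_allowed() is also expected to return True as long as neither socket is listening, which is the "too permissive" side of the trade-off.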
--- Comment #5 from florian.will@gmail.com ---
Created attachment 71901 --> https://bugs.winehq.org/attachment.cgi?id=71901 Patch that "fixes" this issue
This is one way to "fix" this issue, tested to work in ZusiDisplay, but it feels like a terrible hack. It also introduces some test suite failures like "bind succeeded unexpectedly" and EACCES instead of EADDRINUSE error codes (that last one could probably be fixed).
Bernhard mentioned another way to approach this, setting the SO_LINGER timeout to 0 seconds when destroying the socket in wineserver (I haven't tested this).
Maybe there are more options?
--- Comment #6 from florian.will@gmail.com ---
Created attachment 71910 --> https://bugs.winehq.org/attachment.cgi?id=71910 Patch for this bug
This patch should fix at least the "after a program binding to the same port was terminated" issue (that is, this bug report).
When a process is terminated while it still has connected sockets, this patch aborts those connections (TCP RST) instead of going through the normal FIN/ACK sequence via shutdown(). This matches Windows behavior and skips the TIME_WAIT state, so hopefully it fixes the BrokenClient/BrokenServer case? I don't have a C# development environment set up, so I can't test it. Bernhard, do you still have your test case ready, and could you give it a try? It does fix the ZusiDisplay issue for me, at least the one time I tried it.
(I also had to change the sock_dispatch_asyncs function to wake up pending asyncs in case of connection abort, otherwise blocking recv() calls would still hang/block after the connection was reset and seemingly never return. Is this another bug, or am I misusing something in wineserver?)
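For comparison, this is how a blocked receiver behaves on plain host sockets when the peer aborts the connection. The Python sketch says nothing about wineserver's async handling and only shows the outcome a blocking recv() would be expected to have. The function name is hypothetical, and the SO_LINGER layout again assumes a struct linger made of two ints, as on Linux.

```python
import socket
import struct
import threading
import time


def blocked_recv_outcome() -> str:
    """Block a thread in recv(), abort the peer, report how recv() ended."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))           # 0: the OS picks a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    accepted, _ = listener.accept()
    listener.close()

    outcome = []

    def reader():
        try:
            data = client.recv(1)             # blocks until data, EOF or error
            outcome.append("eof" if data == b"" else "data")
        except ConnectionResetError:
            outcome.append("reset")
        except socket.timeout:
            outcome.append("timeout")

    thread = threading.Thread(target=reader)
    thread.start()
    time.sleep(0.2)                           # give the reader time to block

    # Abort the connection: linger on, timeout 0, so close() sends RST.
    accepted.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    accepted.close()

    thread.join(timeout=10)
    client.close()
    return outcome[0] if outcome else "hung"
```

A result of "reset" means the blocked call was woken by the abort. "hung" or "timeout" would correspond to the symptom described in the parenthetical above.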
This patch does not fix the general "Windows appears to allow listening on a port even though some old connection using that port is still in TIME_WAIT, but Linux refuses to do that" issue, but it's good enough for me.
Jinoh Kang jinoh.kang.kr@gmail.com changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |jinoh.kang.kr@gmail.com
--- Comment #7 from Jinoh Kang jinoh.kang.kr@gmail.com ---
(In reply to florian.will from comment #6)
> Created attachment 71910 [details]
> Patch for this bug
> (snip)
> (I also had to change the sock_dispatch_asyncs function to wake up pending asyncs in case of connection abort, otherwise blocking recv() calls would still hang/block after the connection was reset and seemingly never return. Is this another bug, or am I misusing something in wineserver?)

This is #52815.
Bernhard Kölbl besentv@gmail.com changed:
What        |Removed      |Added
----------------------------------------------------------------------------
Status      |UNCONFIRMED  |RESOLVED
Resolution  |---          |FIXED

--- Comment #8 from Bernhard Kölbl besentv@gmail.com ---
This was fixed upstream a while ago.
Gijs Vermeulen gijsvrm@gmail.com changed:
What           |Removed    |Added
----------------------------------------------------------------------------
Fixed by SHA1  |           |ef7de1fc1f011bede33c0fa1d08d5feebba6adf7
Alexandre Julliard julliard@winehq.org changed:
What    |Removed    |Added
----------------------------------------------------------------------------
Status  |RESOLVED   |CLOSED

--- Comment #9 from Alexandre Julliard julliard@winehq.org ---
Closing bugs fixed in 8.9.