https://bugs.winehq.org/show_bug.cgi?id=52401
Bug ID: 52401
Summary: Improper synchronization in sock_recv/sock_send leads to arbitrary reordering of completion of I/O requests
Product: Wine
Version: 7.0-rc6
Hardware: x86-64
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: ntdll
Assignee: wine-bugs@winehq.org
Reporter: jinoh.kang.kr@gmail.com
Distribution: ---
The sock_recv function (dlls/ntdll/unix/socket.c) issues a 'recv_socket' call to the server to complete the I/O request. Before calling the server, the function first attempts to serve the I/O request locally by calling try_recv().
The consequence is that there are two types of asynchronous I/O request (hereinafter "Async") sent to the server:
1. Eager: the try_recv() call succeeded before the 'recv_socket' call; there is no more I/O to be done.
2. Deferred: the request is still pending. When the server detects an incoming packet, async_recv_proc() (on the client side) will be called to complete the I/O.
The problem is that eager Asyncs are _not_ prioritized over deferred Asyncs. Therefore, deferred Asyncs may be completed before eager Asyncs.
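As a rough sketch of this flow: the identifiers sock_recv(), try_recv() and wait_async() are from dlls/ntdll/unix/socket.c, but the signatures, the server_recv_socket() helper and the status plumbing below are simplified placeholders, not the actual Wine code.

#include <stddef.h>

/* Condensed sketch of the client-side flow in sock_recv(); the prototypes
 * below are assumptions standing in for the real Wine plumbing. */
typedef unsigned int NTSTATUS;
typedef void *HANDLE;
#define STATUS_DEVICE_NOT_READY ((NTSTATUS)0xC00000A3)

extern NTSTATUS try_recv( HANDLE sock, void *buf, size_t len, size_t *ret_len );
extern NTSTATUS server_recv_socket( HANDLE sock, NTSTATUS local_status, HANDLE *wait_handle );
extern NTSTATUS wait_async( HANDLE wait_handle );

static NTSTATUS sock_recv_flow( HANDLE sock, void *buf, size_t len, size_t *ret_len )
{
    /* 1. Try to satisfy the request locally first. */
    NTSTATUS status = try_recv( sock, buf, len, ret_len );
    HANDLE wait_handle = NULL;

    /* 2. Regardless of the local result, a 'recv_socket' server call is made,
     *    creating an "eager" Async if try_recv() succeeded, or a "deferred"
     *    Async if it returned STATUS_DEVICE_NOT_READY. */
    server_recv_socket( sock, status, &wait_handle );

    /* 3. Waiting on the handle can first deliver APC_ASYNC_IO for an *older*
     *    deferred Async, whose try_recv() then consumes data that arrived
     *    after the data already read in step 1; hence the reordering. */
    if (wait_handle) status = wait_async( wait_handle );
    return status;
}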
The following scenario illustrates the problem:
1. The socket's incoming buffer is initially empty.
2. Client: The application calls WSARecv(). In sock_recv(), try_recv() fails with STATUS_DEVICE_NOT_READY; therefore, a deferred Async is queued to the server. WSARecv() returns with error ERROR_IO_PENDING.
3. The socket receives packet [A] in the meantime. The socket's incoming buffer is no longer empty.
4. Client: The application calls WSARecv() again. In sock_recv(), try_recv() succeeds with packet [A]; therefore, an eager Async is queued to the server.
5. The socket receives packet [B] in the meantime.
6. Server: the poll() loop detects this and calls async_wake_up( &sock->read_q, status ). This causes APC_ASYNC_IO for the deferred Async to be delivered to the client process.
7. Client: While still in the second sock_recv(), the client does wait_async() on the returned wait handle. This causes the APC_ASYNC_IO (a system APC) to be dequeued.
8. Client (select loop): The client does a server select call. This returns STATUS_KERNEL_APC with APC_ASYNC_IO. The client calls try_recv() (from async_recv_proc), which succeeds with packet [B]. The client process completes the deferred Async with this packet.
9. Client (select loop): The client re-issues the select call after the APC.
10. Server: the wait on the async wait handle is satisfied, causing async_satisfied() to be called. This in turn calls async_set_result(), which completes the eager Async (with packet [A]).
11. Client: The client exits sock_recv() and in turn WSARecv(), which reports immediate success.
(Ditto for sock_send and other similar asynchronous I/O requests.)
If the application uses a completion port, it will observe the deferred Async first and the eager Async second. The deferred Async carries packet [B], while the eager Async carries packet [A]. This results in the application receiving the packets in the wrong order.
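As a schematic illustration of what the application would see (assuming an already connected TCP socket s, a peer that sends packet [A] between the two WSARecv() calls and packet [B] shortly afterwards; the race is timing-dependent, so this is not a reliable reproducer):

#include <winsock2.h>
#include <windows.h>
#include <stdio.h>

/* Schematic only: post two overlapped receives and drain the completion port.
 * On Windows the completions come back in posting order (first with [A],
 * second with [B]); with the bug described above, Wine can complete the
 * first with [B] and the second with [A]. */
static void observe_completion_order( SOCKET s )
{
    HANDLE port = CreateIoCompletionPort( (HANDLE)s, NULL, 0, 0 );
    char buf1[512], buf2[512];
    WSABUF wsabuf1 = { sizeof(buf1), buf1 }, wsabuf2 = { sizeof(buf2), buf2 };
    OVERLAPPED ov1 = {0}, ov2 = {0};
    DWORD flags = 0;

    /* First receive: the incoming buffer is empty, so this one is deferred. */
    WSARecv( s, &wsabuf1, 1, NULL, &flags, (LPWSAOVERLAPPED)&ov1, NULL );

    /* ...the peer is assumed to send packet [A] at this point... */

    /* Second receive: try_recv() succeeds locally with [A] (eager request);
     * packet [B] is assumed to arrive shortly afterwards. */
    flags = 0;
    WSARecv( s, &wsabuf2, 1, NULL, &flags, (LPWSAOVERLAPPED)&ov2, NULL );

    for (int i = 0; i < 2; i++)
    {
        DWORD bytes;
        ULONG_PTR key;
        OVERLAPPED *ov;
        GetQueuedCompletionStatus( port, &bytes, &key, &ov, INFINITE );
        printf( "%s receive completed with %lu bytes\n",
                ov == &ov1 ? "first" : "second", (unsigned long)bytes );
    }
    CloseHandle( port );
}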
--
Three possible solutions come to mind:
1. Drop the try_recv()/try_send() call in sock_recv()/sock_send() and report STATUS_DEVICE_NOT_READY instead (see the sketch after this list). This may slightly reduce performance, since it defers all I/O requests and forces them to go through the poll() loop.
2. Make async_handoff() immediately call async_set_result() if the status and data are already set (i.e. the I/O has completed synchronously). Since this affects other asynchronous operations as well, I'm not sure this approach is semantically correct.
3. Prioritize immediately satisfiable async wait handles _before_ system APCs when waiting for objects. This approach changes semantics too, and appears much uglier than the other solutions.
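To make option 1 concrete, a minimal sketch (the helper below is a placeholder prototype, not the real try_recv() declaration):

/* Option 1, sketched: never complete the request locally; report
 * STATUS_DEVICE_NOT_READY so that every request becomes a deferred Async
 * and the server's poll() loop completes requests strictly in queue order. */
typedef unsigned int NTSTATUS;
#define STATUS_DEVICE_NOT_READY ((NTSTATUS)0xC00000A3)

extern NTSTATUS try_recv_placeholder( void );   /* stands in for the local fast path */

static NTSTATUS local_attempt( void )
{
#if 0
    return try_recv_placeholder();   /* current behaviour: eager completion */
#else
    return STATUS_DEVICE_NOT_READY;  /* option 1: always defer to the server */
#endif
}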
Jinoh Kang jinoh.kang.kr@gmail.com changed:
What            |Removed |Added
----------------------------------------------------------------------------
Regression SHA1 |        |b8f4061df0f71de13abc9edbeb18063d8359678f
CC              |        |z.figura12@gmail.com
Severity        |normal  |major
--- Comment #1 from Jinoh Kang jinoh.kang.kr@gmail.com ---
I forgot that special (kernel) APCs can be delivered at any time.
In this case, the steps after step 6 are slightly modified; the full scenario becomes:
1. The socket's incoming buffer is initially empty.
2. Client: The application calls WSARecv(). In sock_recv(), try_recv() fails with STATUS_DEVICE_NOT_READY; therefore, a deferred Async is queued to the server. WSARecv() returns with ERROR_IO_PENDING.
3. The socket receives packet [A] in the meantime. The socket's incoming buffer is no longer empty.
4. Client: The application calls WSARecv() again. In sock_recv(), try_recv() succeeds with packet [A]; therefore, an eager Async is queued to the server.
5. The socket receives packet [B] in the meantime.
6. Server: the poll() loop detects this, and calls async_wake_up( &sock->read_q, status ). This causes APC_ASYNC_IO for the deferred Async to be delivered to the client process.
7. Client (SIGUSR1 handler): The client does a server select call. This returns STATUS_KERNEL_APC with APC_ASYNC_IO. The client calls try_recv() (from async_recv_proc), which succeeds with packet [B]. The client process completes the deferred Async with this packet.
8. Server: the wait on the async wait handle is satisfied, causing async_satisfied() to be called. This in turn calls async_set_result(), which completes the eager Async (with packet [A]).
9. Client: The client exits sock_recv() and in turn WSARecv(), which reports immediate success.
It appears that my previously suggested solutions 2 and 3 would be useless in this case. Perhaps block any system APCs in the meantime?
If that's undesirable, we're left with solution #1. That shouldn't be too bad though, since the round trip to the server is inevitable anyway. Perhaps we can optimize this path on the server side (instead of doing eager/deferred processing on the client side).
--
Known impacted application: KakaoTalk 3.3.6.2992 (https://appdb.winehq.org/objectManager.php?sClass=version&iId=40551)
lorentz0021@gmail.com changed:
What            |Removed |Added
----------------------------------------------------------------------------
CC              |        |lorentz0021@gmail.com
--- Comment #2 from Jinoh Kang jinoh.kang.kr@gmail.com ---
I realised that more needs to be done for solution #1 if the socket is in fact synchronous.
Also, testing for this bug seems quite complicated--we need to block kernel APCs until after try_recv() has been run. Can we, well, just block SIGUSR1 temporarily in winetest? Sounds like something we wouldn't do in code freeze.
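For what it's worth, blocking SIGUSR1 on the Unix side would look roughly like the sketch below (assuming SIGUSR1 is the system-APC signal); a Windows-side winetest has no way to do this, which is the whole difficulty:

#include <signal.h>

/* Sketch only: block SIGUSR1 (and thus system APC delivery) around a critical
 * region on the Unix side.  Not usable from a Windows-side conformance test. */
static void call_with_sigusr1_blocked( void (*fn)(void *), void *arg )
{
    sigset_t block, old;
    sigemptyset( &block );
    sigaddset( &block, SIGUSR1 );
    pthread_sigmask( SIG_BLOCK, &block, &old );  /* defer APC delivery */
    fn( arg );                                   /* e.g. the local try_recv() attempt */
    pthread_sigmask( SIG_SETMASK, &old, NULL );  /* pending APCs fire here */
}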
Paul Gofman pgofman@codeweavers.com changed:
What            |Removed |Added
----------------------------------------------------------------------------
CC              |        |pgofman@codeweavers.com
--- Comment #3 from Paul Gofman pgofman@codeweavers.com ---
I'd expect solution 1 to have a very unfortunate network performance impact: it will make even fully synchronous socket I/O always take a server async and possibly a SIGUSR1 round trip. Note also that sockets created with socket() are capable of async I/O by default, so checking for socket flags won't change much in the majority of cases. This probably needs something neater. I don't know, maybe some sort of I/O request versioning can work?
--- Comment #4 from Jinoh Kang jinoh.kang.kr@gmail.com ---
(In reply to Paul Gofman from comment #3)
> I'd expect solution 1 to have a very unfortunate network performance impact: it will make even fully synchronous socket I/O always take a server async
We're already doing that. Notice that 'recv_socket' is still called even when try_recv() returns STATUS_SUCCESS, so a server async is still allocated. On top of that, sock_recv() calls wait_async() on the returned wait_handle, which async_handoff() leaves open for synchronously satisfied requests so that completion ports can be notified and APCs can be called.
> and possibly a SIGUSR1 round trip.
This is the real problem. We can solve this by coalescing the 'recv_socket' and 'select' replies into one packet. Basically the algorithm would go like this:
1. The client issues a 'recv_socket' call.
2. The server determines if read_q is empty *and* the request can be served immediately.
3. If this is the case, the server does the following:
   1. The server marks the current thread as being in system APC wait.
   2. The server bypasses sock->read_q and calls async_terminate() directly. (Alternatively, the server first puts the async in read_q and later calls async_wake_up( &sock->read_q ).)
   3. The server dequeues the next system APC (likely APC_ASYNC_IO from 3.2) from the queue.
   4. The server inserts the APC into the 'recv_socket' reply.
Ideally we could make it so that every Wineserver call has a flag that indicates whether it could process the original request *and* retrieve the next system APC from the queue in one go. This will eliminate the extra SIGUSR1 round trip in other cases as well.
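A schematic of that server-side path could look like the following; every name here (handle_recv_socket, pop_system_apc, and so on) is a made-up stand-in for the real wineserver code, so this only shows where the APC would be folded into the reply:

/* Hypothetical sketch of coalescing the 'recv_socket' reply with the next
 * system APC on the server side.  All names below are made up for illustration. */
struct apc_call;          /* stand-in for the wineserver apc_call_t */
struct recv_socket_reply  /* stand-in for the 'recv_socket' reply */
{
    int              has_apc;
    struct apc_call *apc;  /* APC piggybacked on the reply, if any */
};

extern int  read_q_is_empty( void *sock );
extern int  request_immediately_serviceable( void *sock );
extern void terminate_async_directly( void *sock );        /* ~ async_terminate() */
extern struct apc_call *pop_system_apc( void *thread );    /* dequeue pending system APC */

static void handle_recv_socket( void *thread, void *sock, struct recv_socket_reply *reply )
{
    reply->has_apc = 0;
    if (read_q_is_empty( sock ) && request_immediately_serviceable( sock ))
    {
        /* Complete the async right away instead of going through read_q... */
        terminate_async_directly( sock );
        /* ...and hand the resulting APC_ASYNC_IO back in this very reply,
         * saving the extra SIGUSR1 / select round trip. */
        if ((reply->apc = pop_system_apc( thread ))) reply->has_apc = 1;
    }
}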
> Note also that sockets created with socket() are capable of async I/O by default, so checking for socket flags won't change much in the majority of cases. This probably needs something neater. I don't know, maybe some sort of I/O request versioning can work?
I'm afraid that this problem can be reproduced even with fully synchronous socket I/O operations. Just do the second WSARecv (or recv) call in step 4 in another thread.
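A sketch of that variation with plain blocking sockets, the second recv() issued from another thread (illustrative only; it still depends on hitting the same timing window, and the peer is assumed to send [A] and [B] as in the scenario above):

#include <winsock2.h>
#include <windows.h>
#include <stdio.h>

/* Illustrative only: even with blocking sockets, a second recv() issued from
 * another thread while the first recv() is still pending can hit the same
 * eager/deferred race, so the data can come back in the wrong order. */
static DWORD WINAPI second_recv( void *arg )
{
    SOCKET s = (SOCKET)(ULONG_PTR)arg;
    char buf[512];
    Sleep( 50 );                               /* crude: let the first recv() get queued first */
    int n = recv( s, buf, sizeof(buf), 0 );    /* can complete with packet [A]... */
    printf( "second recv got %d bytes\n", n );
    return 0;
}

static void two_thread_recv( SOCKET s )
{
    char buf[512];
    HANDLE thread = CreateThread( NULL, 0, second_recv, (void *)(ULONG_PTR)s, 0, NULL );
    int n = recv( s, buf, sizeof(buf), 0 );    /* ...while this one, issued first, gets packet [B] */
    printf( "first recv got %d bytes\n", n );
    WaitForSingleObject( thread, INFINITE );
    CloseHandle( thread );
}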
--- Comment #5 from Jinoh Kang jinoh.kang.kr@gmail.com ---
I have a patch that addresses the problem and a handful of other todo_wines; however, I'm still unsure how to effectively test for the try_recv() race condition behaviour.
It's not like we can use server_enter_uninterrupted_section() since it's not available on Windows. Right?
--- Comment #6 from Zebediah Figura z.figura12@gmail.com ---
I think the correct solution is to get rid of the initial try_recv(), yes. I basically discussed the problem and my proposed solution here:
https://www.winehq.org/pipermail/wine-devel/2021-December/202825.html
Zebediah Figura z.figura12@gmail.com changed:
What            |Removed                                  |Added
----------------------------------------------------------------------------
Regression SHA1 |b8f4061df0f71de13abc9edbeb18063d8359678f |
--- Comment #7 from Jinoh Kang jinoh.kang.kr@gmail.com ---
sock_send patch submitted: https://source.winehq.org/patches/data/227642
Jinoh Kang jinoh.kang.kr@gmail.com changed:
What            |Removed |Added
----------------------------------------------------------------------------
Version         |7.0-rc6 |7.2
Jinoh Kang jinoh.kang.kr@gmail.com changed:
What            |Removed     |Added
----------------------------------------------------------------------------
Status          |UNCONFIRMED |RESOLVED
Fixed by SHA1   |            |74059d12b83e9f6d6fd64556ba11d7b5331d5d00
Resolution      |---         |FIXED
--- Comment #8 from Jinoh Kang jinoh.kang.kr@gmail.com ---
Fixed in Wine 7.5 (https://source.winehq.org/git/wine.git/commit/74059d12b83e9f6d6fd64556ba11d7...).
Alexandre Julliard julliard@winehq.org changed:
What            |Removed  |Added
----------------------------------------------------------------------------
Status          |RESOLVED |CLOSED
--- Comment #9 from Alexandre Julliard julliard@winehq.org ---
Closing bugs fixed in 7.6.
Zeb Figura z.figura12@gmail.com changed:
What            |Removed |Added
----------------------------------------------------------------------------
CC              |        |shlee1503@naver.com
--- Comment #10 from Zeb Figura z.figura12@gmail.com ---
*** Bug 51034 has been marked as a duplicate of this bug. ***