From: Paul Gofman pgofman@codeweavers.com
--- dlls/ntdll/unix/socket.c | 59 +++++++++++++++++++++++++++++++++++++--- dlls/ws2_32/tests/sock.c | 6 ++-- 2 files changed, 58 insertions(+), 7 deletions(-)
diff --git a/dlls/ntdll/unix/socket.c b/dlls/ntdll/unix/socket.c index 577b48b1336..0b56594eb98 100644 --- a/dlls/ntdll/unix/socket.c +++ b/dlls/ntdll/unix/socket.c @@ -1134,10 +1134,61 @@ static NTSTATUS sock_send( HANDLE handle, HANDLE event, PIO_APC_ROUTINE apc, voi
/* If we had a short write and the socket is nonblocking (and we are * not trying to force the operation to be asynchronous), return - * success. Windows actually refuses to send any data in this case, - * and returns EWOULDBLOCK, but we have no way of doing that. */ - if (status == STATUS_DEVICE_NOT_READY && async->sent_len) - status = STATUS_SUCCESS; + * success, pretened we've written everything to the socket and queue writing + * remaining data. Windows never reports partial write in this case and queues + * virtually unlimited amount of data for background write in this case. */ + if (status == STATUS_DEVICE_NOT_READY && async->sent_len && async->iov_cursor < async->count) + { + struct iovec *iov = async->iov + async->iov_cursor; + SIZE_T data_size, async_size, addr_size; + struct async_send_ioctl *rem_async; + unsigned int i, iov_count; + IO_STATUS_BLOCK *rem_io; + char *p; + + TRACE( "Short write, queueing remaining data.\n" ); + data_size = 0; + iov_count = async->count - async->iov_cursor; + for (i = 0; i < iov_count; ++i) + data_size += iov[i].iov_len; + + addr_size = max( 0, async->addr_len ); + async_size = offsetof( struct async_send_ioctl, iov[iov_count] ) + data_size + addr_size + + sizeof(IO_STATUS_BLOCK); + if (!(rem_async = (struct async_send_ioctl *)alloc_fileio( async_size, async_send_proc, handle ))) + { + status = STATUS_NO_MEMORY; + } + else + { + rem_async->count = iov_count; + p = (char *)rem_async + offsetof( struct async_send_ioctl, iov[iov_count] ); + for (i = 0; i < iov_count; ++i) + { + memcpy( p, iov[i].iov_base, iov[i].iov_len ); + rem_async->iov[i].iov_base = p; + p += iov[i].iov_len; + rem_async->iov[i].iov_len = iov[i].iov_len; + } + rem_async->unix_flags = async->unix_flags; + memcpy( p, async->addr, addr_size ); + rem_async->addr = (const struct WS_sockaddr *)p; + p += addr_size; + rem_async->addr_len = async->addr_len; + rem_async->iov_cursor = 0; + rem_async->sent_len = 0; + rem_io = (IO_STATUS_BLOCK *)p; + p += sizeof(IO_STATUS_BLOCK); + status = sock_send( handle, NULL, NULL, NULL, rem_io, fd, rem_async, TRUE ); + if (status == STATUS_PENDING) status = STATUS_SUCCESS; + if (!status) + { + async->sent_len += data_size; + async->iov_cursor = async->count; + } + else ERR( "Remaining write queue failed, status %#x.\n", status ); + } + }
set_async_direct_result( &wait_handle, options, io, status, async->sent_len, FALSE ); } diff --git a/dlls/ws2_32/tests/sock.c b/dlls/ws2_32/tests/sock.c index 437ee813611..86933d80977 100644 --- a/dlls/ws2_32/tests/sock.c +++ b/dlls/ws2_32/tests/sock.c @@ -6883,9 +6883,9 @@ static void test_write_events(struct event_test_ctx *ctx)
if (!broken(1)) { - /* Windows will never send less than buffer_size bytes here, but Linux - * may do a short write. */ - while ((ret = send(server, buffer, buffer_size, 0)) > 0); + /* Windows will never send less than buffer_size bytes here. */ + while ((ret = send(server, buffer, buffer_size, 0)) > 0) + ok(ret == buffer_size, "got %d.\n", ret); ok(ret == -1, "got %d\n", ret); ok(WSAGetLastError() == WSAEWOULDBLOCK, "got error %u\n", WSAGetLastError());
Hi,
It looks like your patch introduced the new failures shown below. Please investigate and fix them before resubmitting your patch. If they are not new, fixing them anyway would help a lot. Otherwise please ask for the known failures list to be updated.
The tests also ran into some preexisting test failures. If you know how to fix them that would be helpful. See the TestBot job for the details:
The full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=148417
Your paranoid android.
=== w7u_2qxl (32 bit report) ===
ws2_32: sock.c:3172: Test failed: got 0, i 100.
=== w1064_2qxl (64 bit report) ===
ws2_32: sock.c:13198: Test failed: wait timed out
=== debian11b (64 bit WoW report) ===
user32: input.c:4305: Test succeeded inside todo block: button_down_hwnd_todo 1: got MSG_TEST_WIN hwnd 0000000001AB00E4, msg WM_LBUTTONDOWN, wparam 0x1, lparam 0x320032
Helps JOY OF PROGRAMMING - Software Engineering Simulator. EA service is also known to break on short writes (that doesn't usually happen on Linux with its socket usage pattern but is known to be a problem on Macos).
I tested much larger buffers in Windows, it does queue at least up to 2GB of data with such send (also tested with local network besides localhost). Queuing 2x 2GB writes on nonblocking socket will succeed (even when there is obviously nothing reading the data on another socket end), the third will WSAEWOULDBLOCK. When tested over the local network with the receiving end, these data are continued to be stubbornly sent over network even after the socket is closed and process exited.
Below is the list of possible (remaining) issues / corner cases I could think of: - If setting FIONBIO to 0 on the socket is racing with this short write handling. The asynchronously queued part may become synchronous and the recursive sent will wait on the wait_handle returned by server. That is probably for good, let's pretend that racing FIONBIO happened before we started sock_send(); - If there is racing send(). It may actually interpose the data in between the short written part and the queued one. That will be wrong, although not worse what is happening now: if now short read happens, the racing data may be interposed the same way, regardless of whether the app is going to handle short write or not. This may be fixed with some added synchronization but so far seems unnecessary; - UDP sockets. Short read is not a thing with UDP socket, attempt to send a bigger packet than supported by socket / network results in [WSA]EMSGSIZE both on Windows and Unix. That's why I didn't add a check for socket type; - Spurious overlapped IO events won't be sent anywhere, apc_context for the added background read is 0; - The consequent writes will mind the non-empty socket write queue, those shouldn't be performed before that is finished (and will be queued async, waited ot timed out accordingly); - If the socket is closed (or process exits) before the background send is complete, the remaing queued data won't be sent (unlike Windows). This is not ideal but probably not the end of the world. When doing the same on Windows there is no way the app can know if the sent data actually reached the peer or not (while it will be much more likely to reach on Windows of course). The case with socket close can be fixed with some complication around delaying actual socket close on the server in the presence of such asyncs. The case of process exit is probably harder, the data are currently only stored on the sending process and only sending process does actual sends.
Also helps applications built with older versions of gRPC (untested); see https://bugs.winehq.org/show_bug.cgi?id=10648.
On Fri Sep 13 15:14:21 2024 +0000, Zhiyi Zhang wrote:
There are some test failures. https://testbot.winehq.org/JobDetails.pl?Key=148417#k203
I don't think those can be related. Those are Windows failures in a different test (so not affected by Wine impl changes), and the change in test changes absolutely nothing in API calls, only changes how the result is tested (thus doesn't leave much possibilities to affect the process winsock state to affect other tests).
Thinking more of it, it looks like loosing queued data just on socket close is too bad. It is normal to send the last bit of data and close socket. I am going to look into letting the async complete in this case at once.
WRT the other option suggested in the bug (checking if the data can't be sent without partial write), I don't think there is any other option than storing the data and sending in the background. If we could detect short result before actual send it wouldn't help. A few MB size send can't ever be sent on Unix without partial send() and we can't always return EWOULDBLOCK for such attempts.
On Fri Sep 13 15:14:21 2024 +0000, Paul Gofman wrote:
I don't think those can be related. Those are Windows failures in a different test (so not affected by Wine impl changes), and the change in test changes absolutely nothing in API calls, only changes how the result is tested (thus doesn't leave much possibilities to affect the process winsock state to affect other tests).
Oh, besides that the changes to test are inside 'if (broken(1))' and are no-op at all (while I did test that locally of course with that 'if' disabled). I am going to make that into a separate test I guess.