It is possible for the write/writev functions in send_request to return short writes, even in non-error conditions. For example, an unfortunately timed SIGINT after the first 4096 bytes of a pipe transfer will interrupt the syscall and return. In some cases (in particular when no bytes have been transferred at all), the SA_RESTART spec on the signal handler will take care of the restart, but once any bytes have been transferred, the result will be a short write and no automatic restart. Some linux profiling and debugging tooling can also increase the rate of signal interruptions making this corner case more likely.
Make wine robust to this corner case by properly restarting a short write with adjusted buffers. --- dlls/ntdll/unix/server.c | 38 +++++++++++++++++++++++++++++++------- 1 file changed, 31 insertions(+), 7 deletions(-)
diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index a388247beb2..a848328f2a9 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -183,13 +183,22 @@ static DECLSPEC_NORETURN void server_protocol_perror( const char *err ) static unsigned int send_request( const struct __server_request_info *req ) { unsigned int i; - int ret; + int ret = 0;
+ int to_write = sizeof(req->u.req) + req->u.req.request_header.request_size; if (!req->u.req.request_header.request_size) { - if ((ret = write( ntdll_get_thread_data()->request_fd, &req->u.req, - sizeof(req->u.req) )) == sizeof(req->u.req)) return STATUS_SUCCESS; - + const char *write_ptr = (const char *)&req->u.req; + for (;;) { + ret = write( ntdll_get_thread_data()->request_fd, (void*)write_ptr, + to_write ); + if (ret == to_write) return STATUS_SUCCESS; + else if (ret < 0) break; + // Short write (e.g. due to an unfortunately timed SIGINT) - not + // an error. Adjust remaining write length and buffer and start again. + to_write -= ret; + write_ptr += ret; + } } else { @@ -202,11 +211,26 @@ static unsigned int send_request( const struct __server_request_info *req ) vec[i+1].iov_base = (void *)req->data[i].ptr; vec[i+1].iov_len = req->data[i].size; } - if ((ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 )) == - req->u.req.request_header.request_size + sizeof(req->u.req)) return STATUS_SUCCESS; + + for (;;) { + ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 ); + if (ret == to_write) return STATUS_SUCCESS; + else if (ret < 0) break; + // Short write as above. Adjust buffer lengths and start again. + to_write -= ret; + for (unsigned int j = 0; j < i+1; j++) { + if (ret >= vec[j].iov_len) { + ret -= vec[j].iov_len; + vec[j].iov_len = 0; + } else { + vec[j].iov_base = (char *)vec[j].iov_base + ret; + vec[j].iov_len -= ret; + break; + } + } + } }
- if (ret >= 0) server_protocol_error( "partial write %d\n", ret ); if (errno == EPIPE) abort_thread(0); if (errno == EFAULT) return STATUS_ACCESS_VIOLATION; server_protocol_perror( "write" );
Hi Keno,
Op wo 15 dec. 2021 om 11:22 schreef Keno Fischer keno@juliacomputing.com:
// Short write (e.g. due to an unfortunately timed SIGINT) -
not
// an error. Adjust remaining write length and buffer and
start again.
C++-style comments are not permitted in Wine, use /**/ instead.
You'll need to add a Signed-off-by line to your patch as well (same goes for your other patch).
Kind regards, Gijs
On Wed, Dec 15, 2021 at 5:45 AM Gijs Vermeulen gijsvrm@gmail.com wrote:
C++-style comments are not permitted in Wine, use /**/ instead.
Thanks, will fix.
You'll need to add a Signed-off-by line to your patch as well (same goes for your other patch).
Ah, sorry about that. I'll send v2 with that added and the style comment fixed. I also saw that the patch failed on CI, so I'll look into that also, but it's getting late here, so it'll have to wait until morning ;).
Keno
Keno Fischer keno@juliacomputing.com writes:
It is possible for the write/writev functions in send_request to return short writes, even in non-error conditions. For example, an unfortunately timed SIGINT after the first 4096 bytes of a pipe transfer will interrupt the syscall and return. In some cases (in particular when no bytes have been transferred at all), the SA_RESTART spec on the signal handler will take care of the restart, but once any bytes have been transferred, the result will be a short write and no automatic restart. Some linux profiling and debugging tooling can also increase the rate of signal interruptions making this corner case more likely.
Signals are blocked during server requests. If you are indeed getting a SIGINT in the middle of a server call nothing can work, since the SIGINT handler will likely make a nested server call.
(Sorry in advance if this email sequences badly, Alexandre's email didn't come through on my mail client, so I'm replying blindly ;) ).
Hi Alexandre,
Signals are blocked during server requests. If you are indeed getting a SIGINT in the middle of a server call nothing can work, since the SIGINT handler will likely make a nested server call.
You are correct, my apologies. I used SIGINT as an example because I thought it would be the simplest case, but it is indeed not an issue here, because SIGINT is blocked at this point, which I missed. However, the patch is still relevant because syscall interruptions can happen for a number of other reasons. Off the top of my head:
- SIGSTOP/SIGCONT will cause syscall interruptions - As will a cgroup freeze - As I mentioned in the original email, various profile and debug tooling like to cause syscall restarts.
My particular case was the third one since I was trying to debug some crashes we're seeing under wine, and the debug tooling caused additional syscall restarts, which caused wine to be upset.
Keno
On Wed, Dec 15, 2021 at 5:22 AM Keno Fischer keno@juliacomputing.com wrote:
It is possible for the write/writev functions in send_request to return short writes, even in non-error conditions. For example, an unfortunately timed SIGINT after the first 4096 bytes of a pipe transfer will interrupt the syscall and return. In some cases (in particular when no bytes have been transferred at all), the SA_RESTART spec on the signal handler will take care of the restart, but once any bytes have been transferred, the result will be a short write and no automatic restart. Some linux profiling and debugging tooling can also increase the rate of signal interruptions making this corner case more likely.
Make wine robust to this corner case by properly restarting a short write with adjusted buffers.
dlls/ntdll/unix/server.c | 38 +++++++++++++++++++++++++++++++------- 1 file changed, 31 insertions(+), 7 deletions(-)
diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index a388247beb2..a848328f2a9 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -183,13 +183,22 @@ static DECLSPEC_NORETURN void server_protocol_perror( const char *err ) static unsigned int send_request( const struct __server_request_info *req ) { unsigned int i;
- int ret;
int ret = 0;
int to_write = sizeof(req->u.req) + req->u.req.request_header.request_size; if (!req->u.req.request_header.request_size) {
if ((ret = write( ntdll_get_thread_data()->request_fd, &req->u.req,
sizeof(req->u.req) )) == sizeof(req->u.req)) return STATUS_SUCCESS;
const char *write_ptr = (const char *)&req->u.req;
for (;;) {
ret = write( ntdll_get_thread_data()->request_fd, (void*)write_ptr,
to_write );
if (ret == to_write) return STATUS_SUCCESS;
else if (ret < 0) break;
// Short write (e.g. due to an unfortunately timed SIGINT) - not
// an error. Adjust remaining write length and buffer and start again.
to_write -= ret;
write_ptr += ret;
} else {}
@@ -202,11 +211,26 @@ static unsigned int send_request( const struct __server_request_info *req ) vec[i+1].iov_base = (void *)req->data[i].ptr; vec[i+1].iov_len = req->data[i].size; }
if ((ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 )) ==
req->u.req.request_header.request_size + sizeof(req->u.req)) return STATUS_SUCCESS;
for (;;) {
ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 );
if (ret == to_write) return STATUS_SUCCESS;
else if (ret < 0) break;
// Short write as above. Adjust buffer lengths and start again.
to_write -= ret;
for (unsigned int j = 0; j < i+1; j++) {
if (ret >= vec[j].iov_len) {
ret -= vec[j].iov_len;
vec[j].iov_len = 0;
} else {
vec[j].iov_base = (char *)vec[j].iov_base + ret;
vec[j].iov_len -= ret;
break;
}
}
}}
- if (ret >= 0) server_protocol_error( "partial write %d\n", ret ); if (errno == EPIPE) abort_thread(0); if (errno == EFAULT) return STATUS_ACCESS_VIOLATION; server_protocol_perror( "write" );
-- 2.25.1