-- v6: ntdll: Raise SIGQUIT instead of aborting immediately if the server pipe is broken.
From: Zebediah Figura zfigura@codeweavers.com
Keil UV4 C51 starts a thread and then immediately terminates it, usually while it's in the middle of an NtClose(). This results in the thread dying while holding the fd_cache_handle mutex, which results in a hang from other threads in the process.
We don't need to process SIGQUIT immediately, so just mask it off until we're out of the uninterrupted section.
This will require the next commit to fix the bug.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54807 --- dlls/ntdll/unix/server.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index 69e6567c911..c512c1002d5 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -1579,6 +1579,7 @@ size_t server_init_process(void) sigaddset( &server_block_set, SIGIO ); sigaddset( &server_block_set, SIGINT ); sigaddset( &server_block_set, SIGHUP ); + sigaddset( &server_block_set, SIGQUIT ); sigaddset( &server_block_set, SIGUSR1 ); sigaddset( &server_block_set, SIGUSR2 ); sigaddset( &server_block_set, SIGCHLD );
From: Zebediah Figura zfigura@codeweavers.com
Keil UV4 C51 starts a thread and then immediately terminates it, usually while it's in the middle of an NtClose().
The previous commit prevents the SIGQUIT from killing the thread before it exits the fd_cache_section mutex. However, the server will still have closed its side of the socket, resulting in the close_handle server request receiving EPIPE from write(), or zero from read(), at which point the client calls abort_thread() anyway, while still holding the mutex.
Raise SIGQUIT instead, which will terminate the thread once it leaves the uninterrupted section, after releasing any relevant mutexes.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54807 --- dlls/ntdll/unix/server.c | 44 ++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index c512c1002d5..e50a6e78f03 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -206,7 +206,14 @@ static unsigned int send_request( const struct __server_request_info *req ) }
if (ret >= 0) server_protocol_error( "partial write %d\n", ret ); - if (errno == EPIPE) abort_thread(0); + if (errno == EPIPE) + { + /* SIGQUIT might be blocked. We want to terminate ourselves, but only + * after signals have been unblocked (and e.g. mutexes have been + * released). */ + raise( SIGQUIT ); + return STATUS_THREAD_IS_TERMINATING; + } if (errno == EFAULT) return STATUS_ACCESS_VIOLATION; server_protocol_perror( "write" ); } @@ -217,7 +224,7 @@ static unsigned int send_request( const struct __server_request_info *req ) * * Read data from the reply buffer; helper for wait_reply. */ -static void read_reply_data( void *buffer, size_t size ) +static unsigned int read_reply_data( void *buffer, size_t size ) { int ret;
@@ -225,7 +232,7 @@ static void read_reply_data( void *buffer, size_t size ) { if ((ret = read( ntdll_get_thread_data()->reply_fd, buffer, size )) > 0) { - if (!(size -= ret)) return; + if (!(size -= ret)) return STATUS_SUCCESS; buffer = (char *)buffer + ret; continue; } @@ -234,8 +241,11 @@ static void read_reply_data( void *buffer, size_t size ) if (errno == EPIPE) break; server_protocol_perror("read"); } - /* the server closed the connection; time to die... */ - abort_thread(0); + /* SIGQUIT might be blocked. We want to terminate ourselves, but only + * after signals have been unblocked (and e.g. mutexes have been + * released). */ + raise( SIGQUIT ); + return STATUS_THREAD_IS_TERMINATING; }
@@ -246,9 +256,11 @@ static void read_reply_data( void *buffer, size_t size ) */ static inline unsigned int wait_reply( struct __server_request_info *req ) { - read_reply_data( &req->u.reply, sizeof(req->u.reply) ); - if (req->u.reply.reply_header.reply_size) - read_reply_data( req->reply_data, req->u.reply.reply_header.reply_size ); + unsigned int ret; + ret = read_reply_data( &req->u.reply, sizeof(req->u.reply) ); + if (!ret && req->u.reply.reply_header.reply_size) + ret = read_reply_data( req->reply_data, req->u.reply.reply_header.reply_size ); + if (ret) return ret; return req->u.reply.reply_header.error; }
@@ -907,7 +919,14 @@ void wine_server_send_fd( int fd ) if ((ret = sendmsg( fd_socket, &msghdr, 0 )) == sizeof(data)) return; if (ret >= 0) server_protocol_error( "partial write %d\n", ret ); if (errno == EINTR) continue; - if (errno == EPIPE) abort_thread(0); + if (errno == EPIPE) + { + /* SIGQUIT might be blocked. We want to terminate ourselves, but + * only after signals have been unblocked (and e.g. mutexes have + * been released). */ + raise( SIGQUIT ); + return; + } server_protocol_perror( "sendmsg" ); } } @@ -968,8 +987,11 @@ static int receive_fd( obj_handle_t *handle ) if (errno == EPIPE) break; server_protocol_perror("recvmsg"); } - /* the server closed the connection; time to die... */ - abort_thread(0); + /* SIGQUIT might be blocked. We want to terminate ourselves, but only + * after signals have been unblocked (and e.g. mutexes have been + * released). */ + raise( SIGQUIT ); + return -1; }
You changed `wine_server_send_fd`. The change looks correct. OTOH, how about `receive_fd`?
I overlooked it somehow; addressed now. Thanks for catching that.
This merge request was closed by Elizabeth Figura.
From off-list conversation, this isn't going to be enough to solve the general problem of thread state becoming inconsistent when killed while in a syscall, probably mostly because of win32u problems. This needs more thought.