[PATCH v4 0/2] MR4811: ntdll: Avoid hanging when a thread is terminated while holding a pthread mutex.

List overview All Threads

newer

older

[PATCH v3 0/1] MR4832: ntdll:...

[PATCH v2 0/6] MR561: vkd3d-utils:...

Zebediah Figura (＠zfigura)

10 Jan 2024 10 Jan '24

6:22 p.m.

-- v4: ntdll: Raise SIGQUIT instead of aborting immediately if the server pipe is broken.

https://gitlab.winehq.org/wine/wine/-/merge_requests/4811

Show replies by date

Zebediah Figura

10 Jan 10 Jan

6:22 p.m.

New subject: [PATCH v4 1/2] ntdll: Add SIGQUIT to server_block_set.

From: Zebediah Figura zfigura@codeweavers.com

Keil UV4 C51 starts a thread and then immediately terminates it, usually while it's in the middle of an NtClose(). This results in the thread dying while holding the fd_cache_handle mutex, which results in a hang from other threads in the process.

We don't need to process SIGQUIT immediately, so just mask it off until we're out of the uninterrupted section.

This will require the next commit to fix the bug.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54807 --- dlls/ntdll/unix/server.c | 1 + 1 file changed, 1 insertion(+)

diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index 53fcc61ccf3..057efb5e917 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -1579,6 +1579,7 @@ size_t server_init_process(void) sigaddset( &server_block_set, SIGIO ); sigaddset( &server_block_set, SIGINT ); sigaddset( &server_block_set, SIGHUP ); + sigaddset( &server_block_set, SIGQUIT ); sigaddset( &server_block_set, SIGUSR1 ); sigaddset( &server_block_set, SIGUSR2 ); sigaddset( &server_block_set, SIGCHLD );

-- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/4811

Zebediah Figura

6:22 p.m.

New subject: [PATCH v4 2/2] ntdll: Raise SIGQUIT instead of aborting immediately if the server pipe is broken.

From: Zebediah Figura zfigura@codeweavers.com

Keil UV4 C51 starts a thread and then immediately terminates it, usually while it's in the middle of an NtClose().

The previous commit prevents the SIGQUIT from killing the thread before it exits the fd_cache_section mutex. However, the server will still have closed its side of the socket, resulting in the close_handle server request receiving EPIPE from write(), or zero from read(), at which point the client calls abort_thread() anyway, while still holding the mutex.

Raise SIGQUIT instead, which will terminate the thread once it leaves the uninterrupted section, after releasing any relevant mutexes.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54807 --- dlls/ntdll/unix/server.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c index 057efb5e917..df4ead87115 100644 --- a/dlls/ntdll/unix/server.c +++ b/dlls/ntdll/unix/server.c @@ -206,7 +206,14 @@ static unsigned int send_request( const struct __server_request_info *req ) }

if (ret >= 0) server_protocol_error( "partial write %d\n", ret ); - if (errno == EPIPE) abort_thread(0); + if (errno == EPIPE) + { + /* SIGQUIT might be blocked. We want to terminate ourselves, but only + * after signals have been unblocked (and e.g. mutexes have been + * released). */ + raise( SIGQUIT ); + return STATUS_THREAD_IS_TERMINATING; + } if (errno == EFAULT) return STATUS_ACCESS_VIOLATION; server_protocol_perror( "write" ); } @@ -217,7 +224,7 @@ static unsigned int send_request( const struct __server_request_info *req ) * * Read data from the reply buffer; helper for wait_reply. */ -static void read_reply_data( void *buffer, size_t size ) +static unsigned int read_reply_data( void *buffer, size_t size ) { int ret;

@@ -225,7 +232,7 @@ static void read_reply_data( void *buffer, size_t size ) { if ((ret = read( ntdll_get_thread_data()->reply_fd, buffer, size )) > 0) { - if (!(size -= ret)) return; + if (!(size -= ret)) return STATUS_SUCCESS; buffer = (char *)buffer + ret; continue; } @@ -234,8 +241,11 @@ static void read_reply_data( void *buffer, size_t size ) if (errno == EPIPE) break; server_protocol_perror("read"); } - /* the server closed the connection; time to die... */ - abort_thread(0); + /* SIGQUIT might be blocked. We want to terminate ourselves, but only + * after signals have been unblocked (and e.g. mutexes have been + * released). */ + raise( SIGQUIT ); + return STATUS_THREAD_IS_TERMINATING; }

@@ -246,9 +256,11 @@ static void read_reply_data( void *buffer, size_t size ) */ static inline unsigned int wait_reply( struct __server_request_info *req ) { - read_reply_data( &req->u.reply, sizeof(req->u.reply) ); - if (req->u.reply.reply_header.reply_size) - read_reply_data( req->reply_data, req->u.reply.reply_header.reply_size ); + unsigned int ret; + ret = read_reply_data( &req->u.reply, sizeof(req->u.reply) ); + if (!ret && req->u.reply.reply_header.reply_size) + ret = read_reply_data( req->reply_data, req->u.reply.reply_header.reply_size ); + if (ret) return ret; return req->u.reply.reply_header.error; }

-- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/4811

Zebediah Figura (＠zfigura)

6:36 p.m.

...

...
I think calling `raise(SIGQUIT);` just before returning `STATUS_THREAD_IS_TERMINATING` will fix all that, while still achieving your purpose. If the server has already sent SIGQUIT, this has virtually no effect; otherwise, the signal will be delievered once we unblock it later.

That probably makes sense, though.

Done in v3.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/4811#note_57068

Jinoh Kang (＠iamahuman)

11 Jan 11 Jan

1:26 p.m.

New subject: [PATCH v4 0/2] MR4811: ntdll: Avoid hanging when a thread is terminated while holding a pthread mutex. - approved

This merge request was approved by Jinoh Kang.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/4811

Jinoh Kang (＠iamahuman)

17 Jan 17 Jan

3:40 p.m.

On Wed Jan 17 15:40:53 2024 +0000, Zebediah Figura wrote:

...

...
My problem is processes that *continue* to do something externally

visible without assistance from the server. A nontrivial portion of syscalls doesn't actually require the server to work; for example, NtWriteFile() can work with fd cache alone as long as it doesn't need to send completion to the server. I don't think this is a meaningful concern?

...
Other syscalls that *do* contact server will just return

`STATUS_THREAD_IS_TERMINATING`, which has ample possibility to confuse the app which won't voluntarily terminate by itself and will continue doing something weird externally.

...
I think calling `raise(SIGQUIT);` just before returning

`STATUS_THREAD_IS_TERMINATING` will fix all that, while still achieving your purpose. If the server has already sent SIGQUIT, this has virtually no effect; otherwise, the signal will be delievered once we unblock it later. That probably makes sense, though.

...
`receive_fd` is called while `fd_cache_mutex` is held and has

`abort_thread(0);` too; maybe it needs similar treatment? It does, but it doesn't affect this application. I don't know if should be added to the series for code freeze in that case.

We're past the code freeze now. Do you think this is worth revisiting?

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/4811#note_57708

642

Age (days ago)

649

Last active (days ago)

wine-gitlab@winehq.org

5 comments

3 participants

tags (0)

participants (3)

Jinoh Kang (＠iamahuman)
Zebediah Figura
Zebediah Figura (＠zfigura)