This is inspired by the discussion for MR 127 (https://www.winehq.org/pipermail/wine-devel/2022-May/217832.html) and is aimed at replacing the second patch (ws2_32: Use allocated IO status block in select().) in that MR.
When a thread calls the server's terminate_thread for itself, queuing the system APC through SIGUSR1 might race with the thread exit, and there seems to be no easy and reliable way to ensure that the signal is processed before the thread's stack and TEB are freed. Thus I am making the system APCs be queued to the thread so that they are processed by server_select() after the (terminate_thread) server call.
When terminate_thread is called for another thread, the system APC will always be delivered through some other thread. The thread's stack and TEB are not freed when the thread is killed externally, so completing the async from another thread should probably be fine.
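For illustration, the client-side sequence described above would look roughly like the sketch below. This is not the actual ntdll code: the function name is made up, the helper signatures are abbreviated, and only the ordering matters (terminate_thread first, then a zero-length interruptible select to drain the queued system APCs, then the real thread exit).

/* Hypothetical sketch of the described flow (names and signatures approximate). */
static void sketch_terminate_self( HANDLE handle, NTSTATUS exit_code )
{
    SERVER_START_REQ( terminate_thread )
    {
        req->handle    = wine_server_obj_handle( handle );
        req->exit_code = exit_code;
        wine_server_call( req );    /* server cancels the thread's asyncs and queues APC_ASYNC_IO */
    }
    SERVER_END_REQ;

    /* Zero-length interruptible wait: the queued system APCs (async completions) are
     * processed here, while this thread's stack and TEB are still valid, e.g.
     * server_select( NULL, 0, SELECT_INTERRUPTIBLE, 0, ... );  (argument list abbreviated) */

    exit_thread( exit_code );       /* only after this is the stack queued for freeing */
}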
--
v3: ntdll: Cancel asyncs when thread is terminated.
    ws2_32/tests: Add tests for terminated thread asyncs completion.
From: Paul Gofman <pgofman@codeweavers.com>
Signed-off-by: Paul Gofman <pgofman@codeweavers.com>
---
 dlls/ws2_32/tests/afd.c | 159 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 159 insertions(+)
diff --git a/dlls/ws2_32/tests/afd.c b/dlls/ws2_32/tests/afd.c
index 5b64ccd7715..b8579a7c9f7 100644
--- a/dlls/ws2_32/tests/afd.c
+++ b/dlls/ws2_32/tests/afd.c
@@ -2355,6 +2355,164 @@ static void test_getsockname(void)
     CloseHandle(event);
 }
 
+struct ioctl_params
+{
+    HANDLE handle, event;
+    PIO_APC_ROUTINE apc;
+    void *apc_context;
+    IO_STATUS_BLOCK *io;
+    ULONG code;
+    void *in_buffer;
+    ULONG in_size;
+    void *out_buffer;
+    ULONG out_size;
+    NTSTATUS ret;
+    HANDLE complete_event;
+    BOOL kill_thread;
+};
+
+static DWORD WINAPI async_ioctl_thread(void *params)
+{
+    struct ioctl_params *io = params;
+
+    io->ret = NtDeviceIoControlFile(io->handle, io->event, io->apc, io->apc_context, io->io,
+            io->code, io->in_buffer, io->in_size, io->out_buffer, io->out_size);
+    SetEvent(io->complete_event);
+    if (io->kill_thread)
+        Sleep(3000);
+    return io->ret;
+}
+
+static NTSTATUS WINAPI thread_NtDeviceIoControlFile(BOOL kill_thread, HANDLE handle, HANDLE event,
+        PIO_APC_ROUTINE apc, void *apc_context, IO_STATUS_BLOCK *io, ULONG code, void *in_buffer, ULONG in_size,
+        void *out_buffer, ULONG out_size)
+{
+    struct ioctl_params p;
+    HANDLE thread;
+    DWORD ret;
+
+    p.handle = handle;
+    p.event = event;
+    p.apc = apc;
+    p.apc_context = apc_context;
+    p.io = io;
+    p.code = code;
+    p.in_buffer = in_buffer;
+    p.in_size = in_size;
+    p.out_buffer = out_buffer;
+    p.out_size = out_size;
+    p.complete_event = CreateEventW(NULL, FALSE, FALSE, NULL);
+    p.kill_thread = kill_thread;
+
+    thread = CreateThread(NULL, 0, async_ioctl_thread, &p, 0, NULL);
+    ok(!!thread, "got NULL.\n");
+    ret = WaitForSingleObject(p.complete_event, INFINITE);
+    ok(ret == WAIT_OBJECT_0, "got ret %#lx.\n", ret);
+    if (kill_thread)
+        TerminateThread(thread, -1);
+    CloseHandle(p.complete_event);
+    ret = WaitForSingleObject(thread, INFINITE);
+    ok(ret == WAIT_OBJECT_0, "got ret %#lx.\n", ret);
+    CloseHandle(thread);
+    return p.ret;
+}
+
+static void test_async_thread_termination(void)
+{
+    const struct sockaddr_in bind_addr = {.sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_LOOPBACK)};
+    char in_buffer[offsetof(struct afd_poll_params, sockets[3])];
+    char out_buffer[offsetof(struct afd_poll_params, sockets[3])];
+    struct afd_poll_params *in_params = (struct afd_poll_params *)in_buffer;
+    struct afd_poll_params *out_params = (struct afd_poll_params *)out_buffer;
+    LARGE_INTEGER zero = {{0}};
+    ULONG_PTR key, value;
+    IO_STATUS_BLOCK io;
+    HANDLE event, port;
+    ULONG params_size;
+    SOCKET listener;
+    int ret;
+
+    event = CreateEventW(NULL, FALSE, FALSE, NULL);
+
+    listener = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+    ret = bind(listener, (const struct sockaddr *)&bind_addr, sizeof(bind_addr));
+    ok(!ret, "got error %u\n", WSAGetLastError());
+    ret = listen(listener, 1);
+    ok(!ret, "got error %u\n", WSAGetLastError());
+
+    in_params->count = 1;
+    in_params->exclusive = FALSE;
+    in_params->sockets[0].socket = listener;
+    in_params->sockets[0].flags = ~0;
+    in_params->sockets[0].status = 0xdeadbeef;
+    params_size = offsetof(struct afd_poll_params, sockets[1]);
+    in_params->timeout = -10 * 1000 * 1000 * 5;
+
+    memset(&io, 0xcc, sizeof(io));
+    ret = thread_NtDeviceIoControlFile(TRUE, (HANDLE)listener, event, NULL, NULL, &io,
+            IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
+    ok(ret == STATUS_PENDING, "got %#x\n", ret);
+    ret = WaitForSingleObject(event, 1000);
+    todo_wine ok(!ret, "got %#x\n", ret);
+    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+
+    memset(&io, 0xcc, sizeof(io));
+    ret = thread_NtDeviceIoControlFile(FALSE, (HANDLE)listener, event, NULL, NULL, &io,
+            IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
+    ok(ret == STATUS_PENDING, "got %#x\n", ret);
+    ret = WaitForSingleObject(event, 1000);
+    todo_wine ok(!ret, "got %#x\n", ret);
+    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+
+    port = CreateIoCompletionPort((HANDLE)listener, NULL, 0, 0);
+
+    memset(&io, 0xcc, sizeof(io));
+    ret = thread_NtDeviceIoControlFile(TRUE, (HANDLE)listener, event, NULL, (void *)0xdeadbeef, &io,
+            IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
+    ok(ret == STATUS_PENDING, "got %#x\n", ret);
+
+    ret = WaitForSingleObject(event, 1000);
+    todo_wine ok(!ret, "got %#x\n", ret);
+    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+
+    memset(&io, 0xcc, sizeof(io));
+    key = 0xcc;
+    value = 0;
+    ret = NtRemoveIoCompletion(port, &key, &value, &io, &zero);
+    todo_wine
+    {
+        ok(!ret, "got %#x\n", ret);
+        ok(!key, "got key %#Ix\n", key);
+        ok(value == 0xdeadbeef, "got value %#Ix\n", value);
+        ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    }
+
+    memset(&io, 0xcc, sizeof(io));
+    ret = thread_NtDeviceIoControlFile(FALSE, (HANDLE)listener, event, NULL, (void *)0xdeadbeef, &io,
+            IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
+    ok(ret == STATUS_PENDING, "got %#x\n", ret);
+
+    ret = WaitForSingleObject(event, 1000);
+    todo_wine ok(!ret, "got %#x\n", ret);
+    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+
+    memset(&io, 0xcc, sizeof(io));
+    key = 0xcc;
+    value = 0;
+    ret = NtRemoveIoCompletion(port, &key, &value, &io, &zero);
+    todo_wine
+    {
+        ok(!ret, "got %#x\n", ret);
+        ok(!key, "got key %#Ix\n", key);
+        ok(value == 0xdeadbeef, "got value %#Ix\n", value);
+        ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    }
+
+    CloseHandle(port);
+    CloseHandle(event);
+    closesocket(listener);
+}
+
 START_TEST(afd)
 {
     WSADATA data;
@@ -2372,6 +2530,7 @@ START_TEST(afd)
     test_get_events_reset();
    test_bind();
     test_getsockname();
+    test_async_thread_termination();
 
     WSACleanup();
 }
From: Paul Gofman <pgofman@codeweavers.com>
Signed-off-by: Paul Gofman <pgofman@codeweavers.com>
---
 dlls/ws2_32/tests/afd.c | 38 ++++++++++++++++----------------------
 server/async.c          |  5 +++++
 server/file.h           |  1 +
 server/thread.c         |  1 +
 4 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/dlls/ws2_32/tests/afd.c b/dlls/ws2_32/tests/afd.c
index b8579a7c9f7..de042358297 100644
--- a/dlls/ws2_32/tests/afd.c
+++ b/dlls/ws2_32/tests/afd.c
@@ -2453,16 +2453,16 @@ static void test_async_thread_termination(void)
             IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
     ok(ret == STATUS_PENDING, "got %#x\n", ret);
     ret = WaitForSingleObject(event, 1000);
-    todo_wine ok(!ret, "got %#x\n", ret);
-    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    ok(!ret, "got %#x\n", ret);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
 
     memset(&io, 0xcc, sizeof(io));
     ret = thread_NtDeviceIoControlFile(FALSE, (HANDLE)listener, event, NULL, NULL, &io,
             IOCTL_AFD_POLL, in_params, params_size, out_params, params_size);
     ok(ret == STATUS_PENDING, "got %#x\n", ret);
     ret = WaitForSingleObject(event, 1000);
-    todo_wine ok(!ret, "got %#x\n", ret);
-    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    ok(!ret, "got %#x\n", ret);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
 
     port = CreateIoCompletionPort((HANDLE)listener, NULL, 0, 0);
@@ -2472,20 +2472,17 @@ static void test_async_thread_termination(void)
     ok(ret == STATUS_PENDING, "got %#x\n", ret);
 
     ret = WaitForSingleObject(event, 1000);
-    todo_wine ok(!ret, "got %#x\n", ret);
-    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    ok(!ret, "got %#x\n", ret);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
 
     memset(&io, 0xcc, sizeof(io));
     key = 0xcc;
     value = 0;
     ret = NtRemoveIoCompletion(port, &key, &value, &io, &zero);
-    todo_wine
-    {
-        ok(!ret, "got %#x\n", ret);
-        ok(!key, "got key %#Ix\n", key);
-        ok(value == 0xdeadbeef, "got value %#Ix\n", value);
-        ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
-    }
+    ok(!ret, "got %#x\n", ret);
+    ok(!key, "got key %#Ix\n", key);
+    ok(value == 0xdeadbeef, "got value %#Ix\n", value);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
     memset(&io, 0xcc, sizeof(io));
     ret = thread_NtDeviceIoControlFile(FALSE, (HANDLE)listener, event, NULL, (void *)0xdeadbeef, &io,
@@ -2493,20 +2490,17 @@ static void test_async_thread_termination(void)
     ok(ret == STATUS_PENDING, "got %#x\n", ret);
 
     ret = WaitForSingleObject(event, 1000);
-    todo_wine ok(!ret, "got %#x\n", ret);
-    todo_wine ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
+    ok(!ret, "got %#x\n", ret);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
 
     memset(&io, 0xcc, sizeof(io));
     key = 0xcc;
     value = 0;
     ret = NtRemoveIoCompletion(port, &key, &value, &io, &zero);
-    todo_wine
-    {
-        ok(!ret, "got %#x\n", ret);
-        ok(!key, "got key %#Ix\n", key);
-        ok(value == 0xdeadbeef, "got value %#Ix\n", value);
-        ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
-    }
+    ok(!ret, "got %#x\n", ret);
+    ok(!key, "got key %#Ix\n", key);
+    ok(value == 0xdeadbeef, "got value %#Ix\n", value);
+    ok(io.Status == STATUS_CANCELLED, "got %#lx\n", io.Status);
     CloseHandle(port);
     CloseHandle(event);
diff --git a/server/async.c b/server/async.c
index a4fbeab555e..57baef68137 100644
--- a/server/async.c
+++ b/server/async.c
@@ -593,6 +593,11 @@ void cancel_process_asyncs( struct process *process )
     cancel_async( process, NULL, NULL, 0 );
 }
 
+void cancel_terminating_thread_asyncs( struct thread *thread )
+{
+    cancel_async( thread->process, NULL, thread, 0 );
+}
+
 /* wake up async operations on the queue */
 void async_wake_up( struct async_queue *queue, unsigned int status )
 {
diff --git a/server/file.h b/server/file.h
index 9f9d4cd4e1a..0ffe0e2c8dc 100644
--- a/server/file.h
+++ b/server/file.h
@@ -245,6 +245,7 @@ extern struct iosb *async_get_iosb( struct async *async );
 extern struct thread *async_get_thread( struct async *async );
 extern struct async *find_pending_async( struct async_queue *queue );
 extern void cancel_process_asyncs( struct process *process );
+extern void cancel_terminating_thread_asyncs( struct thread *thread );
 
 static inline void init_async_queue( struct async_queue *queue )
 {
diff --git a/server/thread.c b/server/thread.c
index 467ccd1f0db..f49fbf40b78 100644
--- a/server/thread.c
+++ b/server/thread.c
@@ -1462,6 +1462,7 @@ DECL_HANDLER(terminate_thread)
         thread->exit_code = req->exit_code;
         if (thread != current) kill_thread( thread, 1 );
         else reply->self = 1;
+        cancel_terminating_thread_asyncs( thread );
         release_object( thread );
     }
 }
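For context, the new wrapper relies on the existing cancel_async() matching logic. Conceptually (a simplified sketch only, not the actual server/async.c code, and the field names are approximate) it amounts to:

/* Simplified sketch of what cancel_async( thread->process, NULL, thread, 0 ) does here:
 * walk the process's asyncs and cancel every one owned by the terminating thread. */
static void cancel_thread_asyncs_sketch( struct process *process, struct thread *thread )
{
    struct async *async;

    LIST_FOR_EACH_ENTRY( async, &process->asyncs, struct async, process_entry )
    {
        if (async->terminated || async->thread != thread) continue;
        async_terminate( async, STATUS_CANCELLED );  /* queues APC_ASYNC_IO back to the client */
    }
}

(The real code also has to cope with the list changing while asyncs are being terminated; that detail is omitted here.)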
I simplified the thing greatly by not caring about specific synchronization of the async-cancel APC delivery. The reasoning is that in the normal case (when the thread-kill facility and unix_tid are available) the signal should be delivered and start being processed before we return from the server call. For the potential corner cases this hopefully won't be a regression, because those asyncs can currently complete and be delivered when the thread is already dead anyway.
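For reference, the "thread kill facility" is the server's per-thread signal delivery. Simplified from server/thread.c (details such as error handling differ), it looks roughly like the sketch below; the kill() fallback is exactly the corner case where the signal may land on an arbitrary thread of the process.

/* Simplified sketch of how the server delivers SIGUSR1 (not the exact code). */
static int send_thread_signal_sketch( struct thread *thread, int sig )
{
    int ret = -1;

    if (thread->unix_pid != -1)
    {
#ifdef __linux__
        if (thread->unix_tid != -1)   /* precise per-thread delivery */
            ret = syscall( SYS_tgkill, thread->unix_pid, thread->unix_tid, sig );
        else
#endif
            ret = kill( thread->unix_pid, sig );  /* fallback: any thread of the process may receive it */
    }
    return ret != -1;
}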
On 6/3/22 15:12, Paul Gofman (@gofman) wrote:
This is inspired by the discussion for MR 127 (https://www.winehq.org/pipermail/wine-devel/2022-May/217832.html) and is aimed at replacing the second patch (ws2_32: Use allocated IO status block in select().) in that MR.
When a thread calls the server's terminate_thread for itself, queuing the system APC through SIGUSR1 might race with the thread exit, and there seems to be no easy and reliable way to ensure that the signal is processed before the thread's stack and TEB are freed. Thus I am making the system APCs be queued to the thread so that they are processed by server_select() after the (terminate_thread) server call.
When terminate_thread is called for another thread, the system APC will always be delivered through some other thread. The thread's stack and TEB are not freed when the thread is killed externally, so completing the async from another thread should probably be fine.
So, to be clear, in the offending program:
(1) thread terminates itself via NtTerminateThread();
(2) thread calls exit_thread(), which means that its TEB and stack are queued to be freed but are not actually freed yet;
(3) thread closes its server socket, causing the server to notice and kill_thread( thread, 0 );
(4) another thread exits, and the TEB of the first thread is freed
(5) select async is eventually signaled (we never cancel it, although we should), and the server ends up queueing SIGUSR1 to some random thread, since the original thread is terminated
(6) APC_ASYNC_IO handler in said random thread tries to access the original thread's stack and crashes
Do I have that right?
So, with that said, thoughts:
* If a thread dies violently, we can cancel its asyncs from the server side [in kill_thread(), probably? or the terminate_thread handler; see below], and we don't need to worry about them accessing the stack, since the stack is never freed.
* If a thread dies naturally (terminates itself), we can cancel its asyncs from the terminate_thread handler, and then, in the client side but before calling exit_thread(), do a zero-length wait with SELECT_INTERRUPTIBLE, exactly as in v1 of this patch set. My understanding is that these alone are actually sufficient to ensure that any APC_ASYNC_IO resulting from the async termination are executed by the thread itself before it frees its stack. I.e. we don't need any of the "queue_only" bits that were in v1. This holds because:
- terminate_thread doesn't set the state to TERMINATED yet in the "self" case;
- as a result queue_apc() will queue the (system) APC to that specific thread;
- server_select( SELECT_INTERRUPTIBLE ) is actually sufficient to process the entire APC queue. It doesn't matter if SIGUSR1 is sent (and in this case it always will be, since we're in the middle of a server request but not a server *select* at the point the APC is queued); SIGUSR1 is only needed to wake up a thread that isn't otherwise paying attention.
* There are some other potential bugs in this area I noticed while looking around (e.g. we should probably requeue system APCs of a terminated thread to another thread instead of dropping them on the floor; also, apparently, cancellation should always wait), but those seem orthogonal to the problem here and so can wait.
So to fix the bug it should be sufficient to
(a) cancel all thread asyncs in the terminate_thread handler,
(b) server_select( SELECT_INTERRUPTIBLE ) before calling exit_thread()
Does that sound right? Am I missing anything in my logic above?
On 6/3/22 20:30, Zebediah Figura wrote:
On 6/3/22 15:12, Paul Gofman (@gofman) wrote:
So, to be clear, in the offending program:
(1) thread terminates itself via NtTerminateThread();
(2) thread calls exit_thread(), which means that its TEB and stack are queued to be freed but are not actually freed yet;
(3) thread closes its server socket, causing the server to notice and kill_thread( thread, 0 );
(4) another thread exits, and the TEB of the first thread is freed
(5) select async is eventually signaled (we never cancel it, although we should), and the server ends up queueing SIGUSR1 to some random thread, since the original thread is terminated
(6) APC_ASYNC_IO handler in said random thread tries to access the original thread's stack and crashes
Do I have that right?
I didn't debug every bit of how and at which exact moment the stack is freed; is that important? What I observed was:
a) The main thread queues a user APC to the network thread;
b) Once that APC is executed (which requires an alertable wait in select() and recv()), the network thread terminates normally, ending up in NtTerminateThread for itself;
c) Apparently here, between b) and d), the thread ends up being freed;
d) - (5), (6) from your list.
So AFAIU it matches the sequence you describe.
So, with that said, thoughts:
- If a thread dies violently, we can cancel its asyncs from the server side [in kill_thread(), probably? or the terminate_thread handler; see below], and we don't need to worry about them accessing the stack, since the stack is never freed.
Yes
- If a thread dies naturally (terminates itself), we can cancel its asyncs from the terminate_thread handler, and then, in the client side but before calling exit_thread(), do a zero-length wait with SELECT_INTERRUPTIBLE, exactly as in v1 of this patch set. My understanding is that these alone are actually sufficient to ensure that any APC_ASYNC_IO resulting from the async termination are executed by the thread itself before it frees its stack. I.e. we don't need any of the "queue_only" bits that were in v1. This holds because:
- terminate_thread doesn't set the state to TERMINATED yet in the "self" case;
Yes.
- as a result queue_apc() will queue the (system) APC to that specific thread;
It will not put it in the queue; it will send SIGUSR1, since (without the long and probably ugly part which was in v1 of the patch) queue_apc() puts the APC in the queue only if the thread is waiting in a server select or already has some APC queued. The other way (probably not super nice either) would be to put an APC_NONE in the thread's system APC queue; then the logic in queue_apc() would queue our async cancels as well.
- server_select( SELECT_INTERRUPTIBLE ) is actually sufficient to process the entire APC queue. It doesn't matter if SIGUSR1 is sent (and in this case it always will be, since we're in the middle of a server request but not a server *select* at the point the APC is queued); SIGUSR1 is only needed to wake up a thread that isn't otherwise paying attention.
server_select( SELECT_INTERRUPTIBLE ) is sufficient, but due to the above it will be redundant in most practical cases, as the server will already have sent SIGUSR1;
- There are some other potential bugs in this area I noticed while looking around (e.g. we should probably requeue system APCs of a terminated thread to another thread instead of dropping them on the floor; also, apparently, cancellation should always wait), but those seem orthogonal to the problem here and so can wait.
So to fix the bug it should be sufficient to
(a) cancel all thread asyncs in the terminate_thread handler,
(b) server_select( SELECT_INTERRUPTIBLE ) before calling exit_thread()
As I understand it, in the logic of the v1 patch (which I believe is a bit more robust, albeit much more convoluted) we also need to make sure that the APCs get into the queue rather than SIGUSR1 being sent (for the self-terminating thread). In the logic of v2, server_select() is redundant: the thread will get SIGUSR1 and process it itself before returning from the server call, except for some theoretical corner cases (e.g. when the thread-kill syscall is absent and the signal is sent to the whole process, and thus to a random thread).
On 6/3/22 20:46, Paul Gofman wrote:
- If a thread dies naturally (terminates itself), we can cancel its asyncs from the terminate_thread handler, and then, in the client side but before calling exit_thread(), do a zero-length wait with SELECT_INTERRUPTIBLE, exactly as in v1 of this patch set. My understanding is that these alone are actually sufficient to ensure that any APC_ASYNC_IO resulting from the async termination are executed by the thread itself before it frees its stack. I.e. we don't need any of the "queue_only" bits that were in v1. This holds because:
- terminate_thread doesn't set the state to TERMINATED yet in the "self" case;
Yes.
- as a result queue_apc() will queue the (system) APC to that specific thread;
It will not put it in the queue; it will send SIGUSR1, since (without the long and probably ugly part which was in v1 of the patch) queue_apc() puts the APC in the queue only if the thread is waiting in a server select or already has some APC queued. The other way (probably not super nice either) would be to put an APC_NONE in the thread's system APC queue; then the logic in queue_apc() would queue our async cancels as well.
I don't understand what you mean by this. Sending SIGUSR1 doesn't mean that the async isn't queued; it's still put into the system_apc list. SIGUSR1 doesn't carry the async as a payload or anything; it's just a way to force the thread to check its own queue. Am I missing something here?
The way I see it, the exiting thread will select manually, and it'll also be sent SIGUSR1. If SIGUSR1 comes first, that'll trigger a select, and the system APC(s) will be consumed. If the manual select comes first, the same thing happens, and then it doesn't matter whether the SIGUSR1 is received or if it comes too late and gets masked off.
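To illustrate that point, the server-side dispatch can be pictured as the sketch below. This is not the actual server/thread.c code, and thread_is_in_select() is a hypothetical helper standing in for the real wake-up condition; the key property is that the APC is always appended to the target thread's system APC queue, and SIGUSR1 is only a kick to make the thread look at that queue.

/* Simplified sketch of queuing a system APC to a specific thread. */
static int queue_system_apc_sketch( struct thread *thread, struct thread_apc *apc )
{
    if (!thread || thread->state == TERMINATED) return 0;

    list_add_tail( &thread->system_apc, &apc->entry );  /* the APC is always queued */
    if (!thread_is_in_select( thread ))                 /* hypothetical helper */
        send_thread_signal( thread, SIGUSR1 );          /* just forces a queue check */
    return 1;
}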
I don't understand what you mean by this. Sending SIGUSR1 doesn't mean that the async isn't queued; it's still put into the system_apc list. SIGUSR1 doesn't carry the async as a payload or anything; it's just a way to force the thread to check its own queue. Am I missing something here?
Oops, sorry… no, it is me missing an obvious fact which I have known for years; something made me see a problem where there is none. I will update the patch.