Inspired by the discussion for MR 127 (https://www.winehq.org/pipermail/wine-devel/2022-May/217832.html) and is aimed to replace the second patch (ws2_32: Use allocated IO status block in select().) in that MR.
When the thread calls server's terminate_thread for itself queuing the system APC though SIGUSR1 might race with the thread exit, there seems no easy and reliable way to be sure that signal is processed before the thread's stack and TEB are freed. Thus I am making the system APCs to be queued to the thread to be processed by server_select() after (terminate_thread) server call.
When terminate_thread is called for the other thread the system APC will always be delivered through some other thread. Thread stack and TEB are not freed when the thread is killed externally, so completing the async from another thread should probably be fine.
-- v3: ntdll: Cancel asyncs when thread is terminated. ws2_32/tests: Add tests for terminated thread asyncs completion.