This fixes a range of issues in completion port scheduling behaviour.
The following issues fixed by these patches are encountered in various games:

- Closing the last port handle should wake both the threads waiting directly on the port (with a success status) and the threads waiting through NtRemoveIoCompletion[Ex] (with an abandoned wait status). GetQueuedCompletionStatus[Ex] returns an error in that case which is not mapped by default; getting the correct error there also matters, at least for Planet Zoo.
- If one thread calls PostQueuedCompletionStatus immediately followed by CloseHandle on the port, a thread waiting for the completion in NtRemoveIoCompletion[Ex] currently gets STATUS_INVALID_HANDLE most of the time. Once the direct wait on the port handle is satisfied by the completion, the port gets closed, and the following (remove_completion) call in the loop is made with an already invalid handle.
- When work scheduling is performed with worker threads waiting for completions, the server is currently under severe excessive load: a worker thread gets its wait satisfied, but another thread steals the completion before the first one gets to (remove_completion) again. That results in a lot of extra server calls.

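
The expected outcome of the post-then-close and the close-while-waiting cases can be illustrated with a toy model. This is only a sketch, not the Wine server code: `model_waiter`, `model_post` and `model_status_after_close` are hypothetical names, the `MODEL_STATUS_*` macros merely copy the values of the real NTSTATUS codes, and the sketch assumes that a posted completion is assigned to the waiting thread at post time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values of the real NTSTATUS codes, redefined under hypothetical MODEL_
 * names so that the sketch builds without Windows headers. */
#define MODEL_STATUS_SUCCESS          0x00000000u
#define MODEL_STATUS_ABANDONED_WAIT_0 0x00000080u

/* Hypothetical stand-in for a thread blocked in NtRemoveIoCompletion[Ex]. */
struct model_waiter
{
    bool has_completion;   /* a completion was already assigned to the thread */
    unsigned int key;      /* completion key of the assigned completion */
};

/* Post a completion: it is assigned to the waiter right away, so the waiter
 * does not have to go back to the port handle to fetch it. */
static void model_post(struct model_waiter *waiter, unsigned int key)
{
    assert(waiter != NULL);
    waiter->has_completion = true;
    waiter->key = key;
}

/* Status the waiter sees when it wakes up after the last port handle has
 * been closed: success if a completion was assigned before the close,
 * an abandoned wait otherwise. */
static unsigned int model_status_after_close(const struct model_waiter *waiter)
{
    assert(waiter != NULL);
    return waiter->has_completion ? MODEL_STATUS_SUCCESS
                                  : MODEL_STATUS_ABANDONED_WAIT_0;
}
```

In this model a waiter that received a completion before the close never needs the (now invalid) handle again, while a waiter without a completion reports the abandoned wait instead of an invalid handle.
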
The patches also fix the following differences in behaviour from Windows. For these I don't know for sure whether anything depends on them specifically, but it seemed sensible to get them right at once while redoing completion port scheduling:

- The waiting threads should be woken in LIFO order (and a new NtRemoveIoCompletion request should not steal a completion from an already woken thread).
- All the app threads waiting on the completion port directly should be woken when a new completion is added that is not immediately assigned to a "normal" waiter for completions; if the completion is going to be consumed by a thread waiting in NtRemoveIoCompletion, no direct waiters are woken. While I don't know of anything that depends on all of those details, there was a game which waits on a completion port directly together with some other sync object. Also, currently a satisfied direct wait on the completion port will still cause a wakeup from NtRemoveIoCompletion.
- When a thread is not yet assigned to a completion port, a pending APC will take precedence over an existing completion in NtRemoveIoCompletionEx. I don't know whether that has practical importance; my motivation behind "ntdll: Handle user APCs explicitly in NtRemoveIoCompletionEx()." is removing the extra wait call for NtRemoveIoCompletionEx with a zero timeout (which AFAIK is a rather common way to check for present completions).

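
The LIFO wakeup and the direct waiter rule can be sketched as a small model. This is an illustration under stated assumptions, not the server implementation: `port_model`, `push_waiter` and `post_completion` are hypothetical names, thread ids are plain integers, and direct waiters are only counted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 16

/* Hypothetical toy model of a completion port. */
struct port_model
{
    int waiters[MAX_WAITERS]; /* ids of threads blocked in NtRemoveIoCompletion, used as a stack */
    size_t waiter_count;
    int direct_waiters;       /* threads waiting on the port handle itself */
    int queued;               /* completions not assigned to any thread yet */
};

/* A thread starts waiting for a completion; the newest waiter goes on top. */
static void push_waiter(struct port_model *port, int thread_id)
{
    assert(port->waiter_count < MAX_WAITERS);
    port->waiters[port->waiter_count++] = thread_id;
}

/* Post one completion. Returns the id of the thread the completion is
 * assigned to, or -1 if it was queued. *direct_woken receives the number
 * of direct waiters woken by this post. */
static int post_completion(struct port_model *port, int *direct_woken)
{
    *direct_woken = 0;
    if (port->waiter_count)
    {
        /* LIFO: the most recent waiter gets the completion. It is assigned
         * at post time, so a later request cannot steal it, and no direct
         * waiters are woken. */
        return port->waiters[--port->waiter_count];
    }
    /* No "normal" waiter: queue the completion and wake every direct waiter. */
    port->queued++;
    *direct_woken = port->direct_waiters;
    port->direct_waiters = 0;
    return -1;
}
```

With two threads pushed in the order 1, 2 and two direct waiters, the first post goes to thread 2, the second to thread 1, and only the third post (with no waiter left) is queued and wakes the direct waiters.
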