1/6:
It seems weird that we're keeping the "wait" object associated with an IOCP even when we're not waiting on it. That is, I'd expect us to remove the completion_wait from an IOCP as soon as a remove_completion call succeeds, or when the wait times out. I guess it works because wake_up() won't actually do anything if a thread isn't queued on its completion_wait, but it doesn't seem very declarative. Or maybe we need to do it like this for some reason I'm not thinking of?
Why do we keep destroying and recreating the completion_wait? I'd expect us to either destroy it immediately when it's not needed (cf. the above question), or keep it around for the lifetime of the thread and simply retarget it to a different IOCP.
In completion_destroy() you set wait->completion to NULL, which I believe will prevent the list_remove() in cleanup_thread_completion(). That doesn't look right. [Cheating and looking ahead, that's fixed in 4/6.]
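To make the completion_destroy() concern concrete, here is a small standalone model of what I think happens. It is only a sketch of my reading, not the actual server code: the structs are cut-down stand-ins, the list helpers merely mimic the usual Wine ones, and attach(), destroy_clear_only(), destroy_unlink() and thread_cleanup() are names I made up for illustration. I'm also assuming the thread cleanup path guards its list_remove() on wait->completion being non-NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly linked list, loosely modelled on the usual Wine helpers. */
struct list { struct list *next, *prev; };

static void list_init(struct list *head) { head->next = head->prev = head; }
static int list_empty(const struct list *head) { return head->next == head; }
static void list_add_tail(struct list *head, struct list *elem)
{
    elem->next = head;
    elem->prev = head->prev;
    head->prev->next = elem;
    head->prev = elem;
}
static void list_remove(struct list *elem)
{
    elem->next->prev = elem->prev;
    elem->prev->next = elem->next;
}

/* Hypothetical stand-ins, not the real server structures. */
struct completion { struct list wait_queue; };
struct completion_wait
{
    struct completion *completion;   /* port this wait is attached to, or NULL */
    struct list wait_queue_entry;    /* entry in completion->wait_queue */
};

/* Associate a thread's wait object with a port. */
static void attach(struct completion *port, struct completion_wait *wait)
{
    assert(!wait->completion);
    wait->completion = port;
    list_add_tail(&port->wait_queue, &wait->wait_queue_entry);
}

/* Port teardown as I read 1/6: only the back-pointer is cleared. */
static void destroy_clear_only(struct completion_wait *wait)
{
    wait->completion = NULL;
}

/* One possible alternative: unlink the entry before clearing the pointer. */
static void destroy_unlink(struct completion_wait *wait)
{
    list_remove(&wait->wait_queue_entry);
    wait->completion = NULL;
}

/* Thread cleanup, assumed to skip list_remove() once the pointer is NULL. */
static void thread_cleanup(struct completion_wait *wait)
{
    if (!wait->completion) return;
    list_remove(&wait->wait_queue_entry);
    wait->completion = NULL;
}
```

In this model, attach() followed by destroy_clear_only() and thread_cleanup() leaves the entry linked into the port's wait_queue, whereas the destroy_unlink() variant leaves the queue empty. If the real cleanup path is not guarded the way I assume, the model doesn't apply.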
2/6:
The subject of this patch suggests that it fixes the bug you mention (where posting a packet and then immediately closing the port often fails with STATUS_INVALID_HANDLE) but according to my reading that's actually fixed in 4/6 [whose title would not suggest this]. I'd advocate for fixing the split for the sake of future archaeology, but I may be overruled on this concern.
3/6:
Ah, so that explains why we need to keep the thread associated. I'm mildly curious whether that still holds if we wait on a different IOCP instead of the same one.
(This is one of those cases where it would have been helpful to order the tests before the fixes. Probably no point changing that now, though.)
Why the "ok(count <= 1)"? Does this differ between Windows versions, and, if so, can we document this in the code please?