On Mon Sep 1 17:50:28 2025 +0000, Matteo Bruni wrote:
I'm attaching some new ntoskrnl tests which pass for me on up to date Windows 10 and 11. There's room for cleanup but I think they answer most of the questions you had; I'm curious to know if you think this is convincing enough or if you'd like some other piece tested. Or anything else, really (e.g. maybe I'm drawing the wrong conclusions from those test results...).
It looks like only `CancelIo()`, not `CancelIoEx()`, waits for the operations to complete. It waits for completion even if the operations aren't actually cancelled but instead complete normally on their own. On the other hand, `CancelSynchronousIo()` really wants the completion routine itself to call `IoCompleteRequest()`. The values set in the IOSB `Status` and `Information` fields are apparently not particularly important, or at least nothing breaks immediately if those are set to unexpected values. The thread ID check I added in `cancel_ioctl_irp()` does seem to strongly suggest that the cancel routine is called synchronously. Not sure if this test in particular is "safe" for upstream.
Assuming all this makes sense, it looks to me like the approach from this MR is okay once I limit it to `CancelIo()`, leaving `CancelIoEx()` to the old mechanism. Luckily there is no concern with multiple "cancels" having to wait on the same operations. I guess I could also use `async_cancel` for `CancelSynchronousIo()` and get closer to native, although that's never going to be quite right until the whole async cancellation / completion becomes more synchronous.
[ntoskrnl-cancel-tests.txt](/uploads/6067830a1d7b7355df5c048a5d67c4df/ntoskrnl-cancel-tests.txt)
I just pushed a new revision with a bunch of changes, including those already discussed. I didn't touch `CancelSynchronousIo()`'s implementation at all.
In the end I also essentially reverted the change splitting cancelled asyncs into separate lists. That means some trickery in `cancel_process_async()` becomes necessary to avoid looping forever, but it seems like a net positive as it cuts a lot of looping around the various lists.
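To illustrate the kind of trickery meant here, this is a minimal standalone sketch of one way to keep cancelled entries on the same list without looping forever: restart the scan after each cancel, and skip entries already flagged as cancelled. It is not the code from this MR. The struct, its fields, `request_cancel()` and `cancel_asyncs_for_owner()` are all hypothetical names made up for the example.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a server-side async; not Wine's actual struct. */
struct async
{
    struct async *next;
    int           owner;     /* id of the thread that queued the operation */
    int           canceled;  /* set once cancellation has been requested */
};

/* Placeholder for the real cancel work, which in practice could modify the list. */
static void request_cancel( struct async *async )
{
    async->canceled = 1;
}

/* Cancel every async queued by 'owner'; returns how many were newly cancelled. */
static int cancel_asyncs_for_owner( struct async *head, int owner )
{
    struct async *async;
    int count = 0;

restart:
    for (async = head; async; async = async->next)
    {
        if (async->canceled) continue;        /* already handled; skipping it guarantees progress */
        if (async->owner != owner) continue;  /* not ours */
        request_cancel( async );
        count++;
        goto restart;  /* the list may have changed, so rescan from the head */
    }
    return count;
}
```

Each restart happens only after one more entry has been flagged, so the number of rescans is bounded by the number of matching entries, at the cost of quadratic scanning in the worst case.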
WRT the old ntdll tests, I'm no longer checking for completion after `CancelIoEx()` at all. In practice the operations do seem to complete consistently by that point on Windows, probably because the cancel routine is called synchronously. I could restore those checks and mark them as `todo_wine` (or maybe even `flaky_wine`) if preferred.