I'd propose that the constituent conditions of each flag be refactored into separate state variables first. Paradoxically, it may even help to combine two or more binary flags into an enum, since the semantics become simpler.
Probably, but it's not clear to me exactly how to improve them.
I've yet to come up with a concrete proposal, but the basic idea is that they'd be decomposed until some subset of the components can be condensed into a phase enum (or the like).
We can already do this with `signaled` and `unknown_status`, for a (not especially good) example.
Having a phase enumeration would be great if we can make it work, but I'm not sure that we can really decompose anything that currently exists. Decomposing either of these two would require introducing new flags. Granted, those might have simpler meaning, but...
- `direct_result`: from Wine codebase, `a flag if we're passing result directly from request instead of APC`.
The idea behind direct_result is that there are two paths for asyncs:
- Requests that can be filled immediately. The idea is that we can do this in two server requests: one to send and/or retrieve the data, after which the client sets the iosb and/or output buffers, and then a second server request triggers completion once the iosb and output buffers have been filled. This is the "direct_result" case. The prototypical example is a named pipe ioctl. The first request varies, but as you've observed, it's basically anything that calls create_request_async(), where the request handler then must not pend the async.
Nit: "pend the async" may need some clarification. There are two possible interpretations of "pend an async":
- Queueing an async for deferred asynchronous completion.
- Simply calling `set_async_pending(async)`.
For case 2, `direct_result` can still be true if we terminate the async immediately afterwards. This is akin to calling `IoMarkIrpPending(Irp);` and immediately following it with a call to `IoCompleteRequest(Irp, IO_NO_INCREMENT);`.
I'll assume that you meant case 1 here.
Yes, that's what I meant.
- Requests that can't be filled immediately. In this case we use an APC to notify the client when the request is filled. This is the "!direct_result" case. The APC is a callback in a sense, but you can also look at it as two separate requests. As with the direct_result case, we need the server to return the iosb and output buffer (the APC arguments), then the client fills them, then it sends another request to the server (returning from the APC) that lets the server queue completion.
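A rough sketch of the two paths described above, using an entirely hypothetical, reduced state model (the names and structure here are illustrative assumptions, not the actual wineserver types):

```c
#include <assert.h>
#include <stdbool.h>

#define STATUS_PENDING 0x103

/* Hypothetical, simplified model of the two async completion paths:
 * either the handler fills the result immediately (direct_result), or
 * it pends the async and an APC notifies the client later. */
struct async_model
{
    bool direct_result;   /* request completed synchronously on the server */
    unsigned int status;  /* initial status returned by the request handler */
};

/* First server request: the handler either fills the result immediately
 * or pends the async for later APC delivery. */
static void handle_request(struct async_model *async, unsigned int status)
{
    async->status = status;
    /* direct_result iff the handler did not pend the async */
    async->direct_result = (status != STATUS_PENDING);
}

/* Does completion need the APC round trip to the client? */
static bool needs_apc(const struct async_model *async)
{
    return !async->direct_result;
}
```

In both cases the overall shape is the same (server returns iosb/output buffer, client fills them, a second round trip lets the server queue completion); the difference is only whether the second leg is a plain request or an APC return.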
- The async was created via create_request_async(), and
The other path is register_async, and I think that should just go away. The idea of register_async is that the client tries to do i/o, gets EWOULDBLOCK, and then sends an async to the server. It was once used for sockets, and now I believe only applies to some "special" unix file types: serial, char device, mailslot, probably also FIFO. Besides being kind of redundant, it has a fundamental problem, as we've discovered with sockets: inauspiciously timed I/O on a "ready" file can jump ahead of queued I/O. I think in general we want to replace it with something that looks like the socket path.
I think that depends on whether the file object serializes I/O requests. If the file object can complete I/O requests in a nondeterministic order, then I don't think the approach merits the extra round-tripping overhead.
Well, it's serial devices and mailslots, which can probably be tested on Windows, and char/FIFO devices, which can't. But honestly, I'm not sure testing is worth the trouble. It'd be impossible to prove for sure that Windows queues I/O, but I think it's also the only sensible thing to do.
Perhaps more importantly, I think the fewer paths we have for asyncs, the better. If we can eliminate one potential outlier by making them work like sockets, it'll be that much easier to work with the async logic.
Side note: what if `server_read_file` is blocked in an alertable wait for an async handle with `async->unknown_status == 1`, and the blocked thread is sent a user APC? I suppose the thread should not be alerted *until after* the driver returns from the dispatcher.
Yeah, that part isn't really correct as-is. Perhaps more importantly, if the async returns success I don't think the thread should be alerted at all. It's probably not a huge deal, though, since FILE_SYNCHRONOUS_IO_ALERT is not actually exposed through kernel32.
That said, this logic seems questionable. It looks kind of like it exists to avoid calling async_set_result() twice, but that shouldn't be possible anyway. It should only be possible to reach async_satisfied() once. I guess a sufficiently misbehaving client could clone the async handle and then wait on it twice, though... and probably async_set_result() doesn't protect against being called multiple times...
Yeah, `async_set_result` may need its own flag; perhaps `fully_completed` from your suggestion, `settled`, `finalized`, `completion_sent`, or `completion_notified`.
That's possibly better. In this case I think the important thing is also to document what we're trying to protect against.
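For what it's worth, the guard being discussed could look something like this sketch; the flag name (`completion_sent`) and the structure are assumptions, not actual wineserver fields:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical guard against async_set_result() applying a result twice,
 * e.g. if a misbehaving client clones the async handle and waits on it
 * twice.  Illustrative only. */
struct async_guard
{
    bool completion_sent;  /* result has already been applied */
    int result_count;      /* how many times the result was actually applied */
};

static void async_set_result_once(struct async_guard *async)
{
    if (async->completion_sent) return;  /* already finalized; ignore */
    async->completion_sent = true;
    async->result_count++;  /* apply the result exactly once */
}
```

The point of documenting the flag would then be exactly this: it exists to make the second call a no-op, not to encode any I/O state.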
- Any of:
- async_handoff() has not been called, or
I'm not sure this is a meaningful state. Until we get to async_handoff() the async isn't *really* initialized. Honestly, it wouldn't be the craziest thing in the world to defer async creation until that point, possibly by building up some sort of parameters structure that we pass to read/write/ioctl fd callbacks instead of a whole async object. But I don't know, that may be more trouble than it's worth.
`queue_async(q, async);` may precede `async_handoff(async, ...);`. Since `queue_async` needs the async to be materialized, we would need to reorder these two calls.
Indeed, queuing makes it nontrivial. Probably more trouble than it's worth, then.
I suppose we could introduce some sort of bool helper instead of having a direct_result field. It seems relatively clear to me, or at least no worse than its constituent logic, but if someone less used to the code has a suggestion, that does deserve weight. It's worth pointing out, though, that since part of the condition for whether something is a direct result is "was it ever unknown_status", we'd need another bool flag anyway. I guess then the logic would be as simple as "!was_unknown_status && initial_status != STATUS_PENDING".
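Sketching that helper (the field names `was_unknown_status` and `initial_status` are hypothetical stand-ins for whatever state the server would actually keep):

```c
#include <assert.h>
#include <stdbool.h>

#define STATUS_PENDING 0x103

/* Hypothetical replacement of the direct_result field with a derived
 * predicate, as suggested above. */
struct async_flags
{
    bool was_unknown_status;      /* async was ever in unknown_status */
    unsigned int initial_status;  /* status from the initial request */
};

static bool async_is_direct_result(const struct async_flags *async)
{
    return !async->was_unknown_status
        && async->initial_status != STATUS_PENDING;
}
```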
Rather, I want `direct_result` (or a similar flag) to mirror the client state: to be precise, whether the client side has gone through the synchronous IOSB filling code section at the time of the last state synchronization. Ideally, it should be strictly independent of the server or I/O operation state, so that the async code can be decomposed into an I/O state handling component and a client notification handling component.
I don't think I understand what you're trying to do here, which is a bit concerning.
"signaled" is a kind of similar case, where you could in theory break it down into a collection of simpler bits, but not all of those simpler bits currently exist. Fundamentally, "signaled" is "blocking ? fully_completed : !unknown_status".
Er, to be exact, it's `blocking ? fully_completed : (!unknown_status && initial_status_sent_to_client)`, where `initial_status_sent_to_client = irp_completion_deferred || fully_completed`.
I guess that's true, though I wouldn't really describe/design it that way. If we reported the initial status directly from the first server request, then the answer to "is the async signaled?" is "it doesn't matter", but for consistency/simplicity I think it should be "yes" rather than "no". Yes, it's signaled, but nobody's waiting on it so it doesn't matter.
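Transcribing the exact decomposition above into a predicate, purely for illustration (the boolean fields are stand-ins for server state, not actual `struct async` members):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattening of the "signaled" condition:
 * blocking ? fully_completed
 *          : (!unknown_status && initial_status_sent_to_client),
 * where initial_status_sent_to_client =
 *     irp_completion_deferred || fully_completed. */
struct async_state
{
    bool blocking;
    bool fully_completed;
    bool unknown_status;
    bool irp_completion_deferred;
};

static bool async_is_signaled(const struct async_state *a)
{
    bool initial_status_sent_to_client =
        a->irp_completion_deferred || a->fully_completed;
    return a->blocking ? a->fully_completed
                       : (!a->unknown_status && initial_status_sent_to_client);
}
```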
- `async_wake_obj(async)` has been called when `async->unknown_status == 0`, or
async_wake_obj() is horrid, because it's basically a leaky abstraction. There should be a better way to organize this, but I'm not really sure how. It kind of looks like we should wake up the async in async_set_initial_status(), but the problem is that ntoskrnl can either return the initial status of an IRP by itself but leave the IRP pending, or it can return the initial *and* final status of a complete IRP, and for the sake of efficiency I don't think we want to wake up the async in async_set_initial_status() in the latter case. That would basically result in a SIGUSR1 when we could just return the APC by interrupting the wait. It would *work* but it would be slower.
It does not work, since the IOSB has not been filled by `APC_ASYNC_IO` from `set_irp_result` at the time of wake-up.
I suppose `async_wake_obj` should just be renamed to `async_commit_pending_status` or something similar.
I don't think this is much of an improvement. It's still leaking the abstraction.
- (Regardless of `async->blocking`) `async_set_result(async, status, ...)` has been called with `async->terminated && (!async->alerted || status != STATUS_PENDING)`.
Which is to say: the client finished processing an APC and didn't restart the async.
An interesting bit in the condition above is that `STATUS_PENDING` is interpreted *differently* depending on the value of `async->alerted`. I think it's a little inconsistent; we should just have a "restart" boolean flag and fail entirely if `restart && !async->alerted`.
Possibly, yes.