> * The flags are defined in terms of what the flags *do* rather than what the value of the flags *mean*. It doesn't help that the transitions for each flag are quite scattered and sparse.
> * Their behaviors are dependent on the client-side async implementation, so reasoning about them requires verifying how the requestor side behaves.
> * Depending on other part of the async object state, one or more flags may be left unobserved and thus ignored, effectivly becoming a "don't care" bit.
>
> This increases complexity and hinders maintenance in my opinion. In particular, it's difficult to determine precisely *when* the flag is or should be set or unset.
>
> I'd propose that the constituent conditions of each flag be refactored into separate state variables first. Paradoxically, it may even help combine two or more binary flags into an enum as the semantics become more simpler.
Probably, but it's not clear to me how exactly to improve them?
> * `direct_result`: from Wine codebase, `a flag if we're passing result directly from request instead of APC`.
The idea behind direct_result is that there are two paths for asyncs:
* Requests that can be filled immediately. The idea here is then that we can do this in two server requests: one to send and/or retrieve the data, then the client can set iosb and/or output buffers, and then the second server request is done to trigger completion after the iosb and output buffers have been filled. This is the "direct_result" case. The prototypical example is a named pipe ioctl. The first request varies, but as you've observed, it's basically anything that calls create_request_async(), and then the request handler needs to not pend the async.
* Requests that can't be filled immediately. In this case we use an APC to notify the client when the request is filled. This is the "!direct_result" case. The APC is a callback in a sense, but you can also look at it as two separate requests. As with the direct_result case, we need the server to return the iosb and output buffer (the APC arguments), then the client fills them, then it sends another request to the server (returning from the APC) that lets the server queue completion.
> * The async was created via create_request_async(), and
The other path is register_async, and I think that should just go away. The idea of register_async is that the client tries to do i/o, gets EWOULDBLOCK, and then sends an async to the server. It was once used for sockets, and now I believe only applies to some "special" unix file types: serial, char device, mailslot, probably also FIFO. Besides being kind of redundant, it has a fundamental problem, as we've discovered with sockets: inauspiciously timed I/O on a "ready" file can jump ahead of queued I/O. I think in general we want to replace it with something that looks like the socket path.
For the time being, though, you can think of register_async as being a "normal" request async that has initial_status pre-initialized to STATUS_PENDING.
> * The async handle has never been waited on (wait consumes the flag), and
Well, a *successful* wait consumes the flag; the difference is probably a bit pedantic, but if you're trying to fully understand asyncs, probably does still matter :-)
That said, this logic seems questionable. It looks kind of like it exists to avoid calling async_set_result() twice, but that shouldn't be possible anyway. It should only be possible to reach async_satisfied() once. I guess a sufficiently misbehaving client could clone the async handle and then wait on it twice, though... and probably async_set_result() doesn't protect against being called multiple times...
> * Any of:
> * async_handoff() has not been called, or
I'm not sure this is a meaningful state. Until we get to async_handoff() the async isn't *really* initialized. Honestly, it wouldn't be the craziest thing in the world to defer async creation until that point, possibly by building up some sort of parameters structure that we pass to read/write/ioctl fd callbacks instead of a whole async object. But I don't know, that may be more trouble than its worth.
I suppose we could introduce some sort of bool helper instead of having a direct_result field. It seems relatively clear to me, or at least, not worse than its constitutent logic, but if someone less used to the code has a suggestion, that does deserve wait. It's worth pointing out though that given that part of the condition for whether something is a direct result is "was it ever unknown_status", we'd need another bool flag anyway. I guess then the logic would be as simple as "!was_unknown_status && initial_status != STATUS_PENDING".
"signaled" is a kind of similar case, where you could in theory break it down into a collection of simpler bits, but not all of those simpler bits currently exist. Fundamentally, "signaled" is "blocking ? fully_completed : !unknown_status".
> * `async_wake_obj(async)` has been called when `async->unknown_status = 0`, or
async_wake_obj() is horrid, because it's basically a leaky abstraction. There should be a better way to organize this, but I'm not really sure how. It kind of looks like we should wake up the async in async_set_initial_status(), but the problem is that ntoskrnl can either return the initial status of an IRP by itself but leave the IRP pending, or it can return the initial *and* final status of a complete IRP, and for the sake of efficiency I don't think we want to wake up the async in async_set_initial_status() in the latter case. That would basically result in a SIGUSR1 when we could just return the APC by interrupting the wait. It would *work* but it would be slower.
> * `async_handoff(async, ...)` has been called with `get_error() != STATUS_ALERTED && (async->pending || !NT_ERROR(get_error()) && (get_error() != STATUS_PENDING || async->iosb->status != STATUS_PENDING)`.
What makes this ugly, I think, is that async_handoff() just has a really complex convention, that overloads STATUS_PENDING in multiple ways. That is really what ought to be changed; we should just be more explicit about what we want in the request handler. dd58bf9ce23 was a change in this direction, and I think we can do more.
> * (Regardless of `async->blocking`) `async_set_result(async, status, ...)` has been called with `async->terminated && (!async->alerted || status != STATUS_PENDING)`.
Which is to say: the client finished processing an APC and didn't restart the async.
> * If `!async->blocking`, it tries to signal as soon as the initial status is known, *unless* the wait handle is being closed, in which case it doesn't bother to touch the flag at all.
Yeah... that's maybe not great for clarity, but on the other hand, it's also true that "signaled" won't ever be read in that case...
> Independent of above, I'm thinking about introducing a new `completed` flag and replacing `terminated` with `completed || alerted`.
That sounds like probably a good idea, yeah.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/499#note_31672
See https://bugs.winehq.org/show_bug.cgi?id=54832
I'm starting to see this particular FIXME in quite a few games (Escape Goat 2, Murder Miners, and Little Racers STREET to name a few), and since I'm not sure how to fix this I figured I could at least provide a test for someone that knows more SM4 than me :P
--
v4: tests: Add a test for arrays with an expression as the index.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/189
> > > In practice I see a number of difficulties:
> > >
> > > * Shaders are not just code; they also embed interface information (input and output signatures, uniforms, buffers, object bindings, ...) and other random metadata (for example for the hull shaders). Should this be representable in the common IR?
> >
> > We need *some* way to represent that information so it doesn't get lost, at least. I think we don't want side channels, to be clear: I think that there should be a point where all of the information relevant to a shader is represented in v_s_i format, including metadata.
> >
> > And contrary to the instructions themselves, I think we want this information to be in a relatively simple and well-designed form. In a sense, it doesn't need to be "complex" like the instructions since we're not doing optimization passes over it \[if that makes sense\]. I think Conor has basically been pushing v_s_i in this direction in order to accommodate sm6 (at least wrt semantics), and given the tribulations I've had to endure to get sm1 to work, I think that's the right approach.
>
> Yes, although I don't think that necessarily means storing these as "instructions". The approach I'm personally leaning towards at this point is to rebrand "struct vkd3d_shader_instruction_array" as "struct vkd3d_shader_program" (or something along those lines), and then add the input/output/patch constant signatures and the shader version structure to that structure. I think a similar approach should work for e.g. RDEF as well.
Yes, agreed wholeheartedly there.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/174#note_31662
> This is causing several GStreamer-Video-CRITICAL messages to be printed while running the tests:
Indeed. Looking into these might be a good idea, but on the other hand, if there's no actual regression in behaviour then I don't think we care.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2546#note_31661
See https://bugs.winehq.org/show_bug.cgi?id=54832
I'm starting to see this particular FIXME in quite a few games (Escape Goat 2, Murder Miners, and Little Racers STREET to name a few), and since I'm not sure how to fix this I figured I could at least provide a test for someone that knows more SM4 than me :P
--
v3: tests: Add a test for arrays with an expression as the index.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/189
Otherwise, in the added test, we get:
```
vkd3d-compiler: vkd3d-shader/hlsl.c:452: hlsl_init_deref_from_index_chain: Assertion `chain' failed.
```
because on the path that triggers the following error:
```
E5002: Wrong type for argument 1 of 'tex3D': expected 'sampler' or 'sampler3D', but got 'sampler2D'.
```
a NULL params.resource is passed to hlsl_new_resource_load() and
then to hlsl_init_deref_from_index_chain().
--
v2: vkd3d-shader/hlsl: Always specify resource on intrinsic_tex().
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/183