From what I can tell, the recent work on SampleBias/SampleLevel did all of the work for us, we just need to take the same framework and include the case for the `sample_l` instruction.
Tested with some shaders from the native Linux version of Little Racers STREET.
--
v9: tpf: For sample_l/sample_b, set lod swizzle to SCALAR.
vkd3d-shader/tpf: Add support for emitting sample_l instructions
vkd3d-shader/hlsl: Add support for SampleGrad() method
tests: Add a test for SampleGrad() method
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/188
From what I can tell, the recent work on SampleBias/SampleLevel did all of the work for us, we just need to take the same framework and include the case for the `sample_l` instruction.
Tested with some shaders from the native Linux version of Little Racers STREET.
--
v8: vkd3d-shader/tpf: For sample_l/sample_b, set lod swizzle to SCALAR.
vkd3d-shader/tpf: Add support for emitting sample_l instructions
tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/188
~~This one's marked as a draft, as there seems to be a blocker with the method parameters.~~
~~The first commit totally works, _if_ the ddx/ddy parameters are literals - they do _not_ work when passing a variable of any kind. The test comes from tests/d3d12.c, so I'm mostly just trying to migrate that to the HLSL test suite, but it currently hits an assert before we get to the resource load (which does eventually work) and I'm not sure what's causing it:~~
```
vkd3d-compiler: libs/vkd3d-shader/tpf.c:3190: sm4_register_from_node: Assertion `instr->reg.allocated' failed.
```
~~Seems like it's surprised when we try to load from the constant buffer maybe?~~ Fixed!
--
v9: vkd3d-shader/hlsl: Add support for SampleGrad() method
tests: Add a test for SampleGrad() method
tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/184
Exposing the two different ways for IME to work:
* synchronous mode, as implemented by the legacy IME engine (MS Korean IME uses this) with ImmProcessKey / ImeProcessKey followed by ImmTranslateMessage / ImeToAsciiEx call if it returns TRUE
* asynchronous mode, as implemented by the MSCTF IME engine (MS Japanese IME uses this), where key input is hooked elsewhere and where WM_IME_NOTIFY messages with private wparam are used to notify the IME UI of composition updates.
--
v2: imm32/tests: Test MS Japanese IME NIHONGO-NO sequence.
imm32/tests: Test MS Korean IME GA-NA-DA sequence.
imm32/tests: Add some missing local variables declarations.
imm32/tests: Ignore some unknown WM_IME_NOTIFY messages.
imm32/tests: Print human readable IME message names.
imm32/tests: Adjust the ImmSetOpenStatus tests for MS Korean IME.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2749
Exposing the two different ways for IME to work:
* synchronous mode, as implemented by the legacy IME engine (MS Korean IME uses this) with ImmProcessKey / ImeProcessKey followed by ImmTranslateMessage / ImeToAsciiEx call if it returns TRUE
* asynchronous mode, as implemented by the MSCTF IME engine (MS Japanese IME uses this), where key input is hooked elsewhere and where WM_IME_NOTIFY messages with private wparam are used to notify the IME UI of composition updates.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2749
--
v8: vkd3d-shader/ir: Normalise signatures and input/output registers to the Shader Model 6 pattern.
vkd3d-shader/ir: Eliminate struct vkd3d_shader_normaliser.
vkd3d-shader/spirv: Support emitting multi-dimensional array variables.
vkd3d-shader/tpf: Validate input and output index ranges for default control point phases.
vkd3d-shader/tpf: Remove an unnecessary carriage return from a parser error message.
vkd3d-shader/tpf: Validate index range declarations.
vkd3d-shader/tpf: Validate input/output registers.
vkd3d-shader/tpf: Validate signature element masks.
vkd3d-shader/tpf: Validate signature element register indices.
vkd3d-shader/tpf: Validate input/output register index counts.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/181
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v7: vkd3d-shader/tpf: Validate input and output index ranges for default control point phases.
vkd3d-shader/tpf: Remove an unnecessary carriage return from a parser error message.
vkd3d-shader/tpf: Validate index range declarations.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
See https://bugs.winehq.org/show_bug.cgi?id=54832
I'm starting to see this particular FIXME in quite a few games (Escape Goat 2, Murder Miners, and Little Racers STREET to name a few), and since I'm not sure how to fix this I figured I could at least provide a test for someone that knows more SM4 than me :P
--
v7: tests: Add a test for arrays with an expression as the index.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/189
See https://bugs.winehq.org/show_bug.cgi?id=54832
I'm starting to see this particular FIXME in quite a few games (Escape Goat 2, Murder Miners, and Little Racers STREET to name a few), and since I'm not sure how to fix this I figured I could at least provide a test for someone that knows more SM4 than me :P
--
v6: tests: Add a test for arrays with an expression as the index.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/189
Mainly comprises support for allocating arrays of resources, and loading from them, for both SM1 and SM4.
--
v11: vkd3d-shader/hlsl: Introduce hlsl_calloc().
vkd3d-shader/hlsl: Support resource arrays when writting SM4.
vkd3d-shader/hlsl: Write resource loads in SM1.
vkd3d-shader/hlsl: Write sampler declarations in SM1.
vkd3d-shader/hlsl: Track objects sampling dimension.
vkd3d-shader/hlsl: Track object components usage and allocate registers accordingly.
tests: Test objects as parameters.
vkd3d-shader/hlsl: Skip object components when creating input/output copies.
vkd3d-shader/hlsl: Add fixme for uniform copies for objects within structs.
vkd3d-shader/hlsl: Support multiple-register variables in object regsets.
tests: Add additional texture array register reservation tests.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/159
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v3: tests: Enable RWBuffer test.
vkd3d-shader/hlsl: Improve UAV format type checking for buffer types.
vkd3d-shader/hlsl: Add support for writing RWStructuredBuffer declarations.
vkd3d-shader/hlsl: Add support for RWBuffer object.
vkd3d-shader: Fix dcl_uav_typed_* formatting.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/193
--
v7: vkd3d-shader/ir: Normalise signatures and input/output registers to the Shader Model 6 pattern.
vkd3d-shader/tpf: Emit an error if an index range is declared for default control point phase input/output.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/181
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v6: vkd3d-shader/tpf: Emit an error if an index range is declared for default control point phase input/output.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
--
v6: vkd3d-shader/ir: Normalise signatures and input/output registers to the Shader Model 6 pattern.
vkd3d-shader/tpf: Validate input and output index ranges for default control point phases.
vkd3d-shader/tpf: Remove an unnecessary carriage return from a parser error message.
vkd3d-shader/tpf: Validate index range declarations.
vkd3d-shader/tpf: Validate input/output registers.
vkd3d-shader/tpf: Validate signature element masks.
vkd3d-shader/tpf: Validate signature element register indices.
vkd3d-shader/tpf: Validate input/output register index counts.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/181
This fixes the behavior in notepad, with Japanese ibus-mozc IME, when typing "NIHONGO-SPACE-NO", or with Korean ibus-hangul when typing "GA-NA-DA". The latter still has an issue with the last character not appearing in the a composition but I believe it is unrelated to comctl32 and will be fixed by some future imm32 changes.
--
v2: user32: Keep and display composition separate from the selection.
comctl32/edit: Keep and display composition separate from the selection.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2743
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v5: vkd3d-shader/tpf: Validate input and output index ranges for default control point phases.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v4: vkd3d-shader/tpf: Remove an unnecessary carriage return from a parser error message.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
This fixes the behavior in notepad, with Japanese ibus-mozc IME, when typing "NIHONGO-SPACE-NO", or with Korean ibus-hangul when typing "GA-NA-DA". The latter still has an issue with the last character not appearing in the a composition but I believe it is unrelated to comctl32 and will be fixed by some future imm32 changes.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2743
--
v3: tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/191
I think this commit is great as it will reduce the maintenance cost of the different `msxml*` dlls by unifying work.
If a version of a different `msxml*` act differently than the other version we can always do something like in `.spec` files:
```
@ stdcall -private DllGetClassObject(ptr ptr ptr) winexml.MSXML*_DllGetClassObject
```
But by checking: https://gitlab.winehq.org/wine/wine/-/blob/master/dlls/msvcr120/Makefile.in
Moving all code to `msxml` and using `PARENTSRC` would seems to be more in line with what wine is already doing.
Either way is fine, the only thing that is sure is that splitting xml source work for each dll is not healthy, and code need to be cross merged in some way.
Using a `winexml` is more memory efficient, but using `msxml` + `PARENTSRC` seems more application compatible, either way is fine for me.
Also making `wine*` modules seems to become a recent common trend, see: https://gitlab.winehq.org/wine/wine/-/merge_requests/2298
@julliard I think you should make clearer guidelines about making `wine*.dll` modules, and proposing alternative to these modules in the guidelines.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2700#note_31956
Requires testing, as I only have access to a remote macOS machine.
--
v3: mmdevapi: Remove unused "channel" member in set_volumes_params.
winecoreaudio: Implement per-channel volume control.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2732
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v3: vkd3d-shader/tpf: Validate index range declarations.
vkd3d-shader/tpf: Validate input/output registers.
vkd3d-shader/tpf: Validate signature element masks.
vkd3d-shader/tpf: Validate signature element register indices.
vkd3d-shader/tpf: Validate input/output register index counts.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
The tests shows that, for sink writer created by MFCreateSinkWriterFromURL,
we are not able to grab the media sink object immediately after the creation,
unless the media sink object can be created without setting media type(MP3, ASF).
--
v5: mfreadwrite/tests: Test GetService for MP4 sink writer.
mfreadwrite/tests: Add more tests for MFCreateSinkWriterFromURL.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2690
Introduces an `idx_count` field to `struct vkd3d_shader_register`. Another patch set is needed for further validations and to use this field where applicable.
--
v2: vkd3d-shader/tpf: Validate input/output registers.
vkd3d-shader/tpf: Validate signature element masks.
vkd3d-shader/tpf: Validate signature element register indices.
vkd3d-shader/tpf: Validate input/output register index counts.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/198
--
v2: vkd3d-shader/hlsl: Map the colour output for ps_1_* to r0.
vkd3d-shader/hlsl: Rewrite the register allocator to allow allocating in multiple passes.
vkd3d-shader/hlsl: Avoid leaking the allocator register map in allocate_const_registers().
vkd3d-shader/hlsl: Rename struct liveness to struct register_allocator.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/168
> FWIW, it's not necessarily that everything would go directly into v_s_i (clearly HLSL wouldn't), but sm1 and sm4 at least would. Contrary to what Henri said, I was actually of the impression that sm6 could deal directly with v_s_i without adding too much? Besides new opcodes and interfaces, I thought we didn't need much more than a new "SSA" register type.
Well, there's the shader I/O signature and register normalisation work that Conor is currently working on, for example. I don't know if I'd qualify that as "much more", but it's essentially adapting the TPF frontend and SPIR-V backend to accommodate the DXIL frontend. It doesn't touch the representation of individual instructions much, so in that sense, sure. (Also, it ends up actually simplifying the SPIR-V backend by moving some transformations that aren't needed for DXIL to the common IR level where they're easier to do...)
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/174#note_31899
Mostly a refresh of https://gitlab.winehq.org/wine/wine/-/merge_requests/1278, with a bit of cleanup. More to come.
--
v2: winegstreamer: Remove unnecessary media source stream states.
winegstreamer: Synchronize access to the media source from callbacks.
winegstreamer: Synchronize concurrent access to the media stream.
winegstreamer: Synchronize concurrent access to the media source.
winegstreamer: Only break cyclic references in IMFMediaSource_Shutdown.
winegstreamer: Keep a IMFMediaSource pointer in the media stream.
winegstreamer: Query the wg_parser stream in media_stream_create.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2719
> You mean that you'd have just `v_s_i` and every frontend and backend deals directly with it, including for very syntactical things like assembling and disassembling? I don't feel really convinced, it seems to me that you'd have to encode su much stuff in `v_s_i` that each time you need to change something (for example to allow for a new language) the changes can have repercussions in each other language frontend or backend.
Yeah, that's the primary concern.
It's mostly at this point weighing that against the idea of introducing new IRs and translation passes for everything, and also writing specific assemblers and disassemblers for everything, that makes me think keeping v_s_i as complex as it is isn't actually that bad.
FWIW, it's not necessarily that everything would go directly into v_s_i (clearly HLSL wouldn't), but sm1 and sm4 at least would. Contrary to what Henri said, I was actually of the impression that sm6 could deal directly with v_s_i without adding too much? Besides new opcodes and interfaces, I thought we didn't need much more than a new "SSA" register type.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/174#note_31897
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v2: vkd3d-shader/hlsl: Add support for writing RWStructuredBuffer declarations.
vkd3d-shader/hlsl: Add support for RWBuffer object.
vkd3d-shader: Fix dcl_uav_typed_* formatting.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/193
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=18889
--
v6: ntdll: Don't hard-code DLL manifest resource ID when looking up dependency assembly.
kernel32/tests: Test loading assembly manifest resource inside dependencies.
ntdll: Move ACTCTX lpResourceName validation to RtlCreateActivationContext.
kernel32/tests: Test setting lpResourceName to NULL for CreateActCtxW.
kernel32/tests: Remove test for ACTCTX_FLAG_HMODULE_VALID with hModule = NULL case.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2555
Note: typeof (int * unsigned) is unsigned.
So:
- on 64bit CPUs, where sizeof(int) = 4 < sizeof(void*) = 8,
- when the result of the multiplication is supposed to be negative
- there's no progation of the negative sign from 32bit to 64 bit integers
Fixes a crash in Age of Empire II.
Signed-off-by: Eric Pouech <epouech(a)codeweavers.com>
--
v2: evr: Fix incorrect integral computation.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2723
--
v3: tests/d3d12: Test register relative addressing in vertex and pixel shaders.
vkd3d-shader: Introduce an internal sm6 signature structure.
vkd3d-shader/tpf: Return an error from vkd3d_shader_sm4_parser_create() if the parser failed.
vkd3d-shader/d3dbc: Return an error from vkd3d_shader_sm1_parser_create() if the parser failed.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/197
This patch makes index expressions on resources hlsl_ir_index nodes
instead of hlsl_ir_resource_load nodes, because it is not known if they
will be used later as the lhs of an hlsl_ir_resource_store.
For now, the only benefit is consistency.
--
v3: vkd3d-shader/hlsl: Don't keep the implicit mipmap level on hlsl_ir_index.
vkd3d-shader/hlsl: Use hlsl_ir_index for resource access.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/182
--
v2: tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/191
~~This one's marked as a draft, as there seems to be a blocker with the method parameters.~~
~~The first commit totally works, _if_ the ddx/ddy parameters are literals - they do _not_ work when passing a variable of any kind. The test comes from tests/d3d12.c, so I'm mostly just trying to migrate that to the HLSL test suite, but it currently hits an assert before we get to the resource load (which does eventually work) and I'm not sure what's causing it:~~
```
vkd3d-compiler: libs/vkd3d-shader/tpf.c:3190: sm4_register_from_node: Assertion `instr->reg.allocated' failed.
```
~~Seems like it's surprised when we try to load from the constant buffer maybe?~~ Fixed!
--
v8: vkd3d-shader/hlsl: Add support for SampleGrad() method
tests: Add a test for SampleGrad() method
tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/184
Requires testing, as I only have access to a remote macOS machine.
--
v2: mmdevapi: Remove unused "channel" member in set_volumes_params.
winecoreaudio: Implement per-channel volume control.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2732
This prevents hangs when a program sets a new topology after stopping the current
topology, because if we don't reset the flags to 0 the session will not subscribe
to the events of the new topology sources.
--
v2: mf/session: Reset presentation flags when session_clear_presentation is called.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2736
--
v2: vkd3d-shader/tpf: Fail parsing if an input/output parameter order is > 2.
tests/d3d12: Test register relative addressing in vertex and pixel shaders.
vkd3d-shader: Introduce an internal sm6 signature structure.
vkd3d-shader/tpf: Return an error from vkd3d_shader_sm4_parser_create() if the parser failed.
vkd3d-shader/d3dbc: Return an error from vkd3d_shader_sm1_parser_create() if the parser failed.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/197
From what I can tell, the recent work on SampleBias/SampleLevel did all of the work for us, we just need to take the same framework and include the case for the `sample_l` instruction.
Tested with some shaders from the native Linux version of Little Racers STREET.
--
v7: vkd3d-shader/tpf: For sample_l/sample_b, set lod swizzle to SCALAR.
vkd3d-shader/tpf: Add support for emitting sample_l instructions
tests: Add a test for SampleBias() with multiple mipmap levels.
tests: Add a test for sampling from nonzero mipmap levels.
tests: Add a test for loading from nonzero mipmap levels.
tests/shader_runner: Add support for creating mipmapped textures.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/188
This prevents hangs when a program sets a new topology after stopping the current
topology, because if we don't reset the flags to 0 the session will not subscribe
to the events of the new topology sources.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2736
Mainly comprises support for allocating arrays of resources, and loading from them, for both SM1 and SM4.
--
v9: vkd3d-shader/hlsl: Support resource arrays when writting SM4.
vkd3d-shader/hlsl: Write resource loads in SM1.
vkd3d-shader/hlsl: Write sampler declarations in SM1.
vkd3d-shader/hlsl: Track objects sampling dimension.
vkd3d-shader/hlsl: Track object components usage and allocate registers accordingly.
tests: Test objects as parameters.
vkd3d-shader/hlsl: Skip object components when creating input/output copies.
vkd3d-shader/hlsl: Add fixme for uniform copies for objects within structs.
vkd3d-shader/hlsl: Support multiple-register variables in object regsets.
tests: Add additional texture array register reservation tests.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/159
Company of Heroes: Battle of Crete needs a functioning tasklist.exe to exit properly.
--
v5: tasklist: Partially support '/fi' option.
tasklist: Support '/fo' option.
tasklist: Support '/nh' option.
tasklist: Add basic functionality.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2688
On Tue May 2 20:07:50 2023 +0000, Alexandre Julliard wrote:
> I'm not convinced that this justifies a Wine-specific module.
> The usual solution for problems with native dlls is "don't do that",
> i.e. fix the builtin so that native is not required.
> It's hard to tell without more details, but in this case an app-specific
> dll override may also be enough.
That's fair. How would you feel if the guts were moved from msxml3 to msxml? Native overrides of msxml1 are presumably very uncommon (there's nothing in winetricks, e.g.), so it would basically accomplish the same independence of the msxml*.dlls.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2700#note_31794
Mainly comprises support for allocating arrays of resources, and loading from them, for both SM1 and SM4.
--
v8: vkd3d-shader/hlsl: Support resource arrays when writting SM4.
vkd3d-shader/hlsl: Write resource loads in SM1.
vkd3d-shader/hlsl: Write sampler declarations in SM1.
vkd3d-shader/hlsl: Track objects sampling dimension.
vkd3d-shader/hlsl: Track object components usage and allocate registers accordingly.
tests: Test objects as parameters.
vkd3d-shader/hlsl: Skip object components when creating input/output copies.
vkd3d-shader/hlsl: Add fixme for uniform copies for objects within structs.
vkd3d-shader/hlsl: Support multiple-register variables in object regsets.
tests: Add additional texture array register reservation tests.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/159
> > > I'd propose that the constituent conditions of each flag be refactored into separate state variables first. Paradoxically, it may even help combine two or more binary flags into an enum as the semantics become more simpler.
> >
> > Probably, but it's not clear to me how exactly to improve them?
>
> I'm yet to come with a concrete proposal, but the basic idea is that they'll be decomposed until some subset of components can be condensed into a phase enum (or the like).
>
> We can already do this with `signaled` and `unknown_status`, for (a not really good) example.
Having a phase enumeration would be great if we can make it work, but I'm not sure that we can really decompose anything that currently exists. Decomposing either of these two would require introducing new flags. Granted, those might have simpler meaning, but...
> > > * `direct_result`: from Wine codebase, `a flag if we're passing result directly from request instead of APC`.
> >
> > The idea behind direct_result is that there are two paths for asyncs:
> >
> > * Requests that can be filled immediately. The idea here is then that we can do this in two server requests: one to send and/or retrieve the data, then the client can set iosb and/or output buffers, and then the second server request is done to trigger completion after the iosb and output buffers have been filled. This is the "direct_result" case. The prototypical example is a named pipe ioctl. The first request varies, but as you've observed, it's basically anything that calls create_request_async(), and then the request handler needs to not pend the async.
>
> Nit: "pend the async" may need some clarification. There are two possible interpretations to "pend an async":
>
> 1. Queueing an async for deferred asynchronous completion.
> 1. Simply calling `set_async_pending(async)`.
>
> For case 2, `direct_result` can still be true if we terminate the async immediately afterwards. This is akin to calling `IoMarkIrpPending(Irp);` and immediately following it with a call to `IoCompleteRequest(Irp, IO_NO_INCREMENT);`.
>
> I'll assume that you meant case 1 here.
Yes, that's what I meant.
>
> > * Requests that can't be filled immediately. In this case we use an APC to notify the client when the request is filled. This is the "!direct_result" case. The APC is a callback in a sense, but you can also look at it as two separate requests. As with the direct_result case, we need the server to return the iosb and output buffer (the APC arguments), then the client fills them, then it sends another request to the server (returning from the APC) that lets the server queue completion.
> >
> > > * The async was created via create_request_async(), and
> >
> > The other path is register_async, and I think that should just go away. The idea of register_async is that the client tries to do i/o, gets EWOULDBLOCK, and then sends an async to the server. It was once used for sockets, and now I believe only applies to some "special" unix file types: serial, char device, mailslot, probably also FIFO. Besides being kind of redundant, it has a fundamental problem, as we've discovered with sockets: inauspiciously timed I/O on a "ready" file can jump ahead of queued I/O. I think in general we want to replace it with something that looks like the socket path.
>
> I think that depends on whether the file object serializes I/O requests. If the file object can complete I/O requests in a nondeterministic order, then I don't think the approach merits extra round-tripping overhead.
Well, it's serial devices and mailslots, which can probably be tested on Windows, and char/FIFO devices, which can't. But honestly, I'm not sure testing is worth the trouble. It'd be impossible to prove for sure that Windows queues I/O, but I think it's also the only sensible thing to do.
Perhaps more importantly, I think the lesser number of paths we have for asyncs, the better. If we can eliminate one potential outlier by making them work like sockets, it'll be that much easier to work with the async logic.
> Side note: what if `server_read_file` is blocked in an alertable wait for an async handle with `async->unknown_status == 1`, and the blocked thread is sent a user APC? I suppose the thread should not be alerted *until after* the driver returns from the dispatcher.
Yeah, that part isn't really correct as-is. Perhaps more importantly, if the async returns success I don't think the thread should be alerted at all. It's probably not a huge deal, though, since FILE_SYNCHRONOUS_IO_ALERT is not actually exposed through kernel32.
>
> > That said, this logic seems questionable. It looks kind of like it exists to avoid calling async_set_result() twice, but that shouldn't be possible anyway. It should only be possible to reach async_satisfied() once. I guess a sufficiently misbehaving client could clone the async handle and then wait on it twice, though... and probably async_set_result() doesn't protect against being called multiple times...
>
> Yeah, `async_set_result` may need its own flag; perhaps `fully_completed` from your suggestion, `settled`, `finalized`, `completion_sent`, or `completion_notified`.
That's possibly better. In this case I think the important thing is also to document what we're trying to protect against.
>
> > > * Any of:
> > > * async_handoff() has not been called, or
> >
> > I'm not sure this is a meaningful state. Until we get to async_handoff() the async isn't *really* initialized. Honestly, it wouldn't be the craziest thing in the world to defer async creation until that point, possibly by building up some sort of parameters structure that we pass to read/write/ioctl fd callbacks instead of a whole async object. But I don't know, that may be more trouble than its worth.
>
> `queue_async(q, async);` may precede `async_handoff(async, ...);`. Since `queue_async` needs the async to be materialized, we would need to reorder these two calls.
Indeed, queuing makes it nontrivial. Probably more trouble than its worth, then.
>
> > I suppose we could introduce some sort of bool helper instead of having a direct_result field. It seems relatively clear to me, or at least, not worse than its constitutent logic, but if someone less used to the code has a suggestion, that does deserve wait. It's worth pointing out though that given that part of the condition for whether something is a direct result is "was it ever unknown_status", we'd need another bool flag anyway. I guess then the logic would be as simple as "!was_unknown_status && initial_status != STATUS_PENDING".
>
> Rather, I want `direct_result` (or a similar flag) to mirror the client state: to be precise, whether the client side has went through the synchronous IOSB filling code section at the time of last state synchronization. Ideally, it should be strictly independent of the server or I/O operation state, so that the async code can be decomposed into an I/O state handling component and a client notification handling component.
I don't think I understand what you're trying to do here, which is a bit concerning.
>
> > "signaled" is a kind of similar case, where you could in theory break it down into a collection of simpler bits, but not all of those simpler bits currently exist. Fundamentally, "signaled" is "blocking ? fully_completed : !unknown_status".
>
> Er, to be exact, it's `blocking ? fully_completed : (!unknown_status && initial_status_sent_to_client)`, where `initial_status_sent_to_client = irp_completion_deferred || fully_completed`.
I guess that's true, though I wouldn't really describe/design it that way. If we reported the initial status directly from the first server request, then the answer to "is the async signaled?" is "it doesn't matter", but for consistency/simplicity I think it should be "yes" rather than "no". Yes, it's signaled, but nobody's waiting on it so it doesn't matter.
>
> > > * `async_wake_obj(async)` has been called when `async->unknown_status = 0`, or
> >
> > async_wake_obj() is horrid, because it's basically a leaky abstraction. There should be a better way to organize this, but I'm not really sure how. It kind of looks like we should wake up the async in async_set_initial_status(), but the problem is that ntoskrnl can either return the initial status of an IRP by itself but leave the IRP pending, or it can return the initial *and* final status of a complete IRP, and for the sake of efficiency I don't think we want to wake up the async in async_set_initial_status() in the latter case. That would basically result in a SIGUSR1 when we could just return the APC by interrupting the wait. It would *work* but it would be slower.
>
> It does not work, since the IOSB has not been filled by `APC_ASYNC_IO` from `set_irp_result` at the time of wake-up.
>
> What I suppose is that `async_wake_obj` should just be renamed to `async_commit_pending_status` or something similar.
I don't think this is much of an improvement. It's still leaking the abstraction.
> > > * (Regardless of `async->blocking`) `async_set_result(async, status, ...)` has been called with `async->terminated && (!async->alerted || status != STATUS_PENDING)`.
> >
> > Which is to say: the client finished processing an APC and didn't restart the async.
>
> An interesting bit in the condition above is that `STATUS_PENDING` is interpreted *differently* depending on the value of `async->alerted`. I think it's a little inconsistent; we should just have a "restart" boolean flag and fail entirely if `restart && !async->alerted`.
Possibly, yes.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/499#note_31791
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=54901
The pitch is treated as a byte offset between the rows of blocks in tx_compress_dxtn(), so is the Pitch from _LockRect() so it doesn't need a fixup.
That pitch has any effect only when dst texture stride is larger than width being compressed, that's why it was mostly not causing problems before.
--
v2: d3dx9: Fix dst pitch for compressed format in D3DXLoadSurfaceFromMemory().
https://gitlab.winehq.org/wine/wine/-/merge_requests/2725