This comes from behavioral study of Windows, which doesn't seem to check if the
lock is actually exclusively held, and just simply decrement the value stored
in the lock.
This fixes a dead lock which prevents WeCom from starting up.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3504
OPEN_MAX is 10240, which is usually lower than the kern.maxfilesperproc sysctl
value. Said value from sysctl works on Mojave to Monterey (and most likely
earlier Mac OS versions, but I can't test). Since Big Sur we can successfully
set the reported rlim_max value.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3503
On Thu Aug 3 07:27:22 2023 +0000, Hans Leidekker wrote:
> Can you give an example where wprintf() doesn't do the right thing?
Printing a non-english username.
`wprintf()` by default doesn't print non-english letters at all (I remember testing various system LC_ALL variants, like `C.UTF-8`, `en_US.UTF-8`, `ru_RU.UTF-8`, and none of that took an effect).
With `setlocale(LC_ALL, something)` it does print the letters, but
1. Only of a language that is set, so I presume it chooses something like a `cp1251` equivalent (an 8-bit language-specific encoding?).
2. It isn't used in any of other wine programs, while that `ConsoleWriteW()` hack is used at least in present whoami and xcopy implementations (probably more).
This implementation, in turn, behaves exactly like the Windows whoami.
Also when testing the `klist` I noticed that on Windows it was printing `?` signs instead of `Server` string (somehow hooked up a translation?), so `ConsoleWriteW()` seems to be necessary for any translatable console programs if one wants to avoid `setlocale()`.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3473#note_41167
SPIR-V says this is undefined behaviour, but Direct3D actually specifies that it
should clamp. Most drivers do clamp, but llvmpipe does not. Hence this fixes a
couple of tests with llvmpipe.
This does of course add overhead to the ftou instruction, but I cannot imagine
that it is a common instruction to execute. This also is not the first time we
perform such checks; cf. udiv.
--
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/289
--
v2: vkd3d-shader/hlsl: Use hlsl_block_add_instr() in clone_block().
vkd3d-shader/hlsl: Clean up the static_initializers block when the context is destroyed (Valgrind).
vkd3d-shader/hlsl: Pass a hlsl_block pointer to dump_instr_list().
vkd3d-shader/hlsl: Store the "instrs" field of struct hlsl_attribute as a hlsl_block.
vkd3d-shader/hlsl: Pass an hlsl_block pointer to add_load_component().
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/290
Hans Leidekker (@hans) commented about programs/whoami/main.c:
> +#define WIN32_LEAN_AND_MEAN
> #define SECURITY_WIN32
> #include <windows.h>
> #include <security.h>
> +#include <sddl.h>
> +#include <stdarg.h>
>
> #include "wine/debug.h"
>
> WINE_DEFAULT_DEBUG_CHANNEL(whoami);
>
> -static int output_write(const WCHAR* str, int len)
> +/* Behaves like the regular wprintf, but is able to fall back to the OEMCP
> + * when redirected.
> + * Most importantly, it prints fully in Unicode by default,
> + * so any language characters are shown regardless of the current locale.
Can you give an example where wprintf() doesn't do the right thing?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3473#note_41157
Today, if media_source_Shutdown is called around the same time
as a work item is put on the async_commands_queue, we end up in
a deadlock if Shutdown enters media source's cs first, as it
waits for the queue's callback to finish, which, in turn,
waits for the object's cs to be released.
To avoid this leave the cs, before unlocking the workqueue,
to let any callback on the queue finish running.
Signed-off-by: Bernhard Kölbl <bkoelbl(a)codeweavers.com>
Blocks !3351
--
v2: winegstreamer: Leave media source critical section before unlocking workqueue.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3491
--
v3: mf/tests: Test IMFClockStateSink in shutdown state.
mf/tests: Test sample processing for MPEG4 media sink.
mf/tests: Use IMFMediaEventGenerator interface in event wait helper.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3369
Recap:
I implemented relative addressing in !229, but I was suggested to handle register indexes by moving all the registers to the heap, which I did in my [nonconst-offsets-6](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/noncon… branch. A part of this implementation required !269, but I was asked to try to turn the sm4 register structs into the vkd3d-shader register structs instead, to get closer to a common low level IR.
This is the first part of that transformation. The whole thing is in my [use_vkd3d_reg_4](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/use_vkd3d… branch.
~~This is built on top of !225 (the first two patches), so it may be good to get that upstream first.~~
---
This patch series aims to do the following replacements:
```
struct sm4_register -> struct vkd3d_shader_register
struct sm4_dst_register -> struct vkd3d_shader_dst_param
struct sm4_src_register -> struct vkd3d_shader_src_param
```
to get us closer to a common level IR and simplify the implementation of relative-addressing.
To achieve this, the fields in the sm4 register structs are replaced with fields in the vkd3d-shader structs or removed altogether, one at the time.
As can be seen when looking at the whole branch, it is possible to do this transformation without having to add additional fields to `struct vkd3d_shader_register`, by restricting each register type to a single `enum vkd3d_sm4_dimension` (and its src registers to a single `enum vkd3d_sm4_swizzle_type`) by default.
The only exception we need so far is for sampler registers: They always have dimension NONE, except when used as arguments of gather instructions, in which case they have dimension VEC4 and SCALAR swizzle. This, and similar exceptions we may find in the future, can be handled using the opcode_info, as in 7/8 [81e17506](https://gitlab.winehq.org/fcasas/vkd3d/-/commit/81e17506ba2cb1fbf….
--
v6: vkd3d-shader/tpf: Get rid of sm4_register.dim.
vkd3d-shader/tpf: Use 's' in src_info to expect sampler register with vec4 dimension.
vkd3d-shader/tpf: Separate dst register write function.
vkd3d-shader/tpf: Separate src register write function.
vkd3d-shader/tpf: Make register_type_table an array of structs with lookup tables.
vkd3d-shader/tpf: Introduce struct tpf_writer to group sm4 write arguments.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/281
On Wed Aug 2 20:04:56 2023 +0000, Bernhard Kölbl wrote:
> Since no one answered, yet:
> 1. IIRC we don't want ReactOS code in Wine, as it might be reverse engineered.
> 2. This very likely needs tests.
I wrote the functions in question here first; and also put up a PR over there with a modified version of it.
I can think about some tests.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3360#note_41123
1. Subtracting integers may result in an overflow or underflow.
2. Right at the 'edge' of underflowing, the result of subtraction may be
`INT_MIN`, and the call to `abs()` will also result in `INT_MIN`.
This fix accounts for all of these conditions.
EXAMPLES:
1. can be encountered by comparing 2.0 and -2.0
2. can be encountered by comparing -2.0 and 2.0
NOTE:
There are 14 more instances of `compare_float` across several modules.
I would like to make sure the logic is sound with this MR, then I can take on the rest.
--
v4: ddraw/tests: Use compare_uint() in compare_float() instead of abs().
https://gitlab.winehq.org/wine/wine/-/merge_requests/3458
On Wed Aug 2 20:04:56 2023 +0000, Huw Campbell wrote:
> I had a review from the ReactOS guys with some suggestions, but before I
> make them is there any interest in taking up maintenance changes like
> the for the shell?
> Who should I ping for a review?
Since no one answered, yet:
1. IIRC we don't want ReactOS code in Wine, as it might be reverse engineered.
2. This very likely needs tests.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3360#note_41115
Today, if media_source_Shutdown is called around the same time
as a work item is put on the async_commands_queue, we end up in
a deadlock if Shutdown enters media source's cs first, as it
waits for the queue's callback to finish, which, in turn,
waits for the object's cs to be released.
To avoid this leave the cs, before unlocking the workqueue,
to let any callback on the queue finish running.
Signed-off-by: Bernhard Kölbl <bkoelbl(a)codeweavers.com>
Blocks !3351
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3491
I expect this is going to be tricky to get in. I ran into the following issues:
* test_ShowWindow behaves very strangely on Windows. It seems this isn't typical behavior, and is caused by an interaction with test_SetFocus, but I'm not sure exactly what it does to the thread state that causes this.
* Many of the SetWindowPos flag combinations tested don't actually show the window on Windows, therefore they don't send EVENT_SYSTEM_FOREGROUND. I was able to reproduce this with a stand-alone test, so it seems to be normal behavior.
* Windows can show a window without activating it, and Wine on X11 can't do that reliably for managed windows.
* There are a couple of cases where Windows sends an event and Wine doesn't. I figure it's OK to not cover every case, but I can go back and investigate those if needed.
This interacts with https://gitlab.winehq.org/wine/wine/-/merge_requests/2314, in that the tests don't really test much without it. I applied that MR for my own testing.
--
v3: win32u: Implement EVENT_SYSTEM_FOREGROUND.
user32: Run tests that notice focus changes early.
https://gitlab.winehq.org/wine/wine/-/merge_requests/2853
This is probably the most ugly and controversial bit of API. I don't really know
if this is the right approach to solving this.
sm4 registers match by register index, such that shaders can mostly be compiled
in isolation. sm1 does not—registers may be specified in any order in the vertex
and pixel shaders, and will be matched by usage and usage index.
By itself this is not much of a problem. Where it gets hairy is that we want to
do some degree of caching, as well as pre-compilation, and avoid recompiling
either shader every time it's matched with a new one.
Wine currently deals with this problem, for GLSL, by generating a "main" GLSL
shader for the vertex shader, and then an extra function setup_vs_output(), and
linking the two together every time a new pixel shader is used. This could in
theory be used for SPIR-V, but it requires the use of extra, probably external,
code to link SPIR-V shaders together, which I do not particularly anticipate
being well-received.
(I'm not sure how Wine deals with this problem in the ARB backend. It seems to
take the pixel shader signature into account when generating the vertex shader—
cf. init_output_registers()—but it doesn't take it into account when looking up
a vertex shader variant? I didn't look too closely at the code, so maybe I'm
missing something.)
--
The vkd3d parts of this patch are quite straightforward, and looking at them, I
think the design is quite intuitive in isolation. There may be some room for
internal refactoring (in particular with an eye to not so much overloading
the "register_index" field of struct shader_signature_element) but I'm
relatively happy with the way it turned out. In isolation, that is.
The Wine part is worse. I've uploaded branches for vkd3d and Wine that use this
API, and correctly handle shaders with some nontrivial reordering:
https://gitlab.winehq.org/zfigura/vkd3d/-/commits/himavant5https://gitlab.winehq.org/zfigura/wine/-/commits/himavant_cb
The test can be run, as before, like so:
make tests/shader_runner.exe && WINE_D3D_CONFIG=renderer=vulkan wine tests/shader_runner.exe ../vkd3d/tests/hlsl/sm1-interstage-interface.shader_test
The interesting Wine parts are concentrated in a single patch, 5cfb9d930f11e.
The patch takes a few shortcuts, partly because I wanted not to block the vkd3d
API design questions, but also because while writing it I came up with a couple
of problems that I wasn't sure how to fix. There are two main problems I see:
(1) This patch has the user pass the signature from the pixel shader when
compiling the vertex shader, and looks up register indices already
arbitrarily allocated by the pixel shader. This is problematic when trying
to use this signature as a cache key, by virtue of it not being clear (or
even defined) which fields are key elements and which aren't. It's also not
particularly kind to lookup, on account of not being directly comparable
with memcmp(). There are a few options I see:
(a) Provide an internal function to compare key elements. This feels... odd,
like a very special-purpose function, but perhaps workable.
(b) Just make the user deal with it, and assert that all fields are key
elements.
(c) Use some alternative, perhaps shortened structure as a field of
vkd3d_shader_next_stage_info. This has the disadvantage that it is not
as simple for a hypothetical user to retrieve from the pixel shader, but
we would presumably provide a function to generate one from a shader
signature. This would probably also be kinder to cache lookup if it's
shorter.
(d) Make caching vkd3d's responsibility, to some degree. This seems
daunting, but the more we optimize, the more difficult it may be to
design API that allows for nice caching.
(2) Assuming we use signatures, there is a memory management problem that
5cfb9d930f11e spells out. This is probably a matter of "just fix it", but
I suppose another option is to take the GLSL or ARB architecture.
--
April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
--
v8: vkd3d-shader: Introduce a function to build a varying map between sm1 stages.
vkd3d-shader/spirv: Make output varyings not consumed by the next stage private variables.
vkd3d-shader: Implement remapping shader output registers to match the next shader's semantics.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/280
It's no problem to send fewer of these per MR. I have included the complete set because all but the last introduce no functional changes, and upstreaming a smaller set would leave the changes in a half-done state with unnecessary buffering.
--
This merge request has too many patches to be relayed via email.
Please visit the URL below to see the contents of the merge request.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/294
Recap:
I implemented relative addressing in !229, but I was suggested to handle register indexes by moving all the registers to the heap, which I did in my [nonconst-offsets-6](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/noncon… branch. A part of this implementation required !269, but I was asked to try to turn the sm4 register structs into the vkd3d-shader register structs instead, to get closer to a common low level IR.
This is the first part of that transformation. The whole thing is in my [use_vkd3d_reg](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/use_vkd3d_r… branch.
~~This is built on top of !225 (the first two patches), so it may be good to get that upstream first.~~
---
This patch series aims to do the following replacements:
```
struct sm4_register -> struct vkd3d_shader_register
struct sm4_dst_register -> struct vkd3d_shader_dst_param
struct sm4_src_register -> struct vkd3d_shader_src_param
```
to get us closer to a common level IR and simplify the implementation of relative-addressing.
To achieve this, the fields in the sm4 register structs are replaced with fields in the vkd3d-shader structs or removed altogether, one at the time.
As can be seen when looking at the whole branch, it is possible to do this transformation without having to add additional fields to `struct vkd3d_shader_register`, by restricting each register type to a single `enum vkd3d_sm4_dimension` (and its src registers to a single `enum vkd3d_sm4_swizzle_type`) by default.
The only exception we need so far is for sampler registers: They always have dimension NONE, except when used as arguments of gather instructions, in which case they have dimension VEC4 and SCALAR swizzle. This, and similar exceptions we may find in the future, can be handled using the opcode_info, as in 7/8 [81e17506](https://gitlab.winehq.org/fcasas/vkd3d/-/commit/81e17506ba2cb1fbf….
--
v5: vkd3d-shader/tpf: Get rid of sm4_register.dim.
vkd3d-shader/tpf: Use 's' in src_info to expect sampler register with vec4 dimension.
vkd3d-shader/tpf: Separate dst register write function.
vkd3d-shader/tpf: Separate src register write function.
vkd3d-shader/tpf: Make register_type_table an array of structs with lookup tables.
vkd3d-shader/tpf: Introduce struct tpf_writer to group sm4 write arguments.
vkd3d-shader/tpf: Use struct vkd3d_shader_register_index in sm4_register.idx[].
vkd3d-shader/tpf: Allow passing NULL register type on hlsl_sm4_register_from_semantic().
vkd3d-shader/tpf: Use enum vkd3d_shader_register_type in sm4_register.type.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/281
--
v2: winedump: Support REG_QWORD values in regf files.
winedump: Support dumping UTF16 value names in regf files.
winedump: Don't dump volatile keys from regf file.
winedump: Enlarge buffer in dump_want_n helper.
winedump: Skip data blocks when dumping regf files.
winedump: Support REG_MULTI_SZ values in regf files.
winedump: Support REG_BINARY values in regf files.
winedump: Support REG_NONE values in regf files.
winedump: Support REG_EXPAND_SZ values in regf files.
winedump: Fix empty string handling in regf files.
winedump: Support dumping default values without VAL_COMP_NAME flag.
winedump: Support REG_DWORD values in regf files.
winedump: Add support for decoding data stored in offset in regf files.
winedump: Fix non null terminated strings printing in regf files.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3477
On Mon Jul 31 08:12:43 2023 +0000, Rémi Bernon wrote:
> Actually I'm not even sure this is very useful. The scheme resolver also
> sets the `MF_BYTESTREAM_CONTENT_TYPE` attributes on Windows (although to
> `audio/x-wav`), which will be enough to not require `MF_RESOLUTION_CONTENT_DOES_NOT_HAVE_TO_MATCH_EXTENSION_OR_MIME_TYPE`.
> On Wine we will miss the content type until we have a better
> implementation of http scheme handlers with access to the response headers.
I've added a generic `MF_BYTESTREAM_CONTENT_TYPE` instead.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3439#note_41020
Let The Good Life able to play its intro video, the game creates a source reader from a `http://localhost:6000/<random-hash>` URL. This should also probably work with other games playing streams over http(s).
This is a very basic implementation, using urlmon, and it will download the entire stream to a local temporary file before playback. An more optimized implementation would probably use WinHttp and range requests to partially download the requested stream segments, but this can be implemented in a future change.
--
v3: mf/scheme_handler: Implement http(s):// scheme handler using urlmon.
mf/scheme_handler: Split file scheme handler to scheme_handler.c.
mf/tests: Add some network scheme resolver tests.
include: Add MF_BYTESTREAM_EFFECTIVE_URL GUID declaration.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3439
This fixes an issue when the path includes non-ASCII characters.
Signed-off-by: Jactry Zeng <jzeng(a)codeweavers.com>
--
v4: mshtml: Call UrlUnescapeW() with URL_UNESCAPE_AS_UTF8 in is_gecko_path().
shlwapi/tests: Test UrlUnescapeW() with URL_UNESCAPE_AS_UTF8.
kernelbase: Implement URL_UNESCAPE_AS_UTF8 for UrlUnescapeW().
shlwapi/tests: Test UrlUnescapeW() with independent data.
https://gitlab.winehq.org/wine/wine/-/merge_requests/585
This is probably the most ugly and controversial bit of API. I don't really know
if this is the right approach to solving this.
sm4 registers match by register index, such that shaders can mostly be compiled
in isolation. sm1 does not—registers may be specified in any order in the vertex
and pixel shaders, and will be matched by usage and usage index.
By itself this is not much of a problem. Where it gets hairy is that we want to
do some degree of caching, as well as pre-compilation, and avoid recompiling
either shader every time it's matched with a new one.
Wine currently deals with this problem, for GLSL, by generating a "main" GLSL
shader for the vertex shader, and then an extra function setup_vs_output(), and
linking the two together every time a new pixel shader is used. This could in
theory be used for SPIR-V, but it requires the use of extra, probably external,
code to link SPIR-V shaders together, which I do not particularly anticipate
being well-received.
(I'm not sure how Wine deals with this problem in the ARB backend. It seems to
take the pixel shader signature into account when generating the vertex shader—
cf. init_output_registers()—but it doesn't take it into account when looking up
a vertex shader variant? I didn't look too closely at the code, so maybe I'm
missing something.)
--
The vkd3d parts of this patch are quite straightforward, and looking at them, I
think the design is quite intuitive in isolation. There may be some room for
internal refactoring (in particular with an eye to not so much overloading
the "register_index" field of struct shader_signature_element) but I'm
relatively happy with the way it turned out. In isolation, that is.
The Wine part is worse. I've uploaded branches for vkd3d and Wine that use this
API, and correctly handle shaders with some nontrivial reordering:
https://gitlab.winehq.org/zfigura/vkd3d/-/commits/himavant5https://gitlab.winehq.org/zfigura/wine/-/commits/himavant_cb
The test can be run, as before, like so:
make tests/shader_runner.exe && WINE_D3D_CONFIG=renderer=vulkan wine tests/shader_runner.exe ../vkd3d/tests/hlsl/sm1-interstage-interface.shader_test
The interesting Wine parts are concentrated in a single patch, 5cfb9d930f11e.
The patch takes a few shortcuts, partly because I wanted not to block the vkd3d
API design questions, but also because while writing it I came up with a couple
of problems that I wasn't sure how to fix. There are two main problems I see:
(1) This patch has the user pass the signature from the pixel shader when
compiling the vertex shader, and looks up register indices already
arbitrarily allocated by the pixel shader. This is problematic when trying
to use this signature as a cache key, by virtue of it not being clear (or
even defined) which fields are key elements and which aren't. It's also not
particularly kind to lookup, on account of not being directly comparable
with memcmp(). There are a few options I see:
(a) Provide an internal function to compare key elements. This feels... odd,
like a very special-purpose function, but perhaps workable.
(b) Just make the user deal with it, and assert that all fields are key
elements.
(c) Use some alternative, perhaps shortened structure as a field of
vkd3d_shader_next_stage_info. This has the disadvantage that it is not
as simple for a hypothetical user to retrieve from the pixel shader, but
we would presumably provide a function to generate one from a shader
signature. This would probably also be kinder to cache lookup if it's
shorter.
(d) Make caching vkd3d's responsibility, to some degree. This seems
daunting, but the more we optimize, the more difficult it may be to
design API that allows for nice caching.
(2) Assuming we use signatures, there is a memory management problem that
5cfb9d930f11e spells out. This is probably a matter of "just fix it", but
I suppose another option is to take the GLSL or ARB architecture.
--
April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
--
v7: vkd3d-shader: Introduce a function to build a varying map between sm1 stages.
vkd3d-shader/spirv: Make output varyings not consumed by the next stage private variables.
vkd3d-shader: Implement remapping shader output registers to match the next shader's semantics.
vkd3d-shader: Add a separate field for the target location of a signature element.
vkd3d-shader/d3dbc: Make sure all inter-stage varyings have a unique register index.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/280
This is probably the most ugly and controversial bit of API. I don't really know
if this is the right approach to solving this.
sm4 registers match by register index, such that shaders can mostly be compiled
in isolation. sm1 does not—registers may be specified in any order in the vertex
and pixel shaders, and will be matched by usage and usage index.
By itself this is not much of a problem. Where it gets hairy is that we want to
do some degree of caching, as well as pre-compilation, and avoid recompiling
either shader every time it's matched with a new one.
Wine currently deals with this problem, for GLSL, by generating a "main" GLSL
shader for the vertex shader, and then an extra function setup_vs_output(), and
linking the two together every time a new pixel shader is used. This could in
theory be used for SPIR-V, but it requires the use of extra, probably external,
code to link SPIR-V shaders together, which I do not particularly anticipate
being well-received.
(I'm not sure how Wine deals with this problem in the ARB backend. It seems to
take the pixel shader signature into account when generating the vertex shader—
cf. init_output_registers()—but it doesn't take it into account when looking up
a vertex shader variant? I didn't look too closely at the code, so maybe I'm
missing something.)
--
The vkd3d parts of this patch are quite straightforward, and looking at them, I
think the design is quite intuitive in isolation. There may be some room for
internal refactoring (in particular with an eye to not so much overloading
the "register_index" field of struct shader_signature_element) but I'm
relatively happy with the way it turned out. In isolation, that is.
The Wine part is worse. I've uploaded branches for vkd3d and Wine that use this
API, and correctly handle shaders with some nontrivial reordering:
https://gitlab.winehq.org/zfigura/vkd3d/-/commits/himavant5https://gitlab.winehq.org/zfigura/wine/-/commits/himavant_cb
The test can be run, as before, like so:
make tests/shader_runner.exe && WINE_D3D_CONFIG=renderer=vulkan wine tests/shader_runner.exe ../vkd3d/tests/hlsl/sm1-interstage-interface.shader_test
The interesting Wine parts are concentrated in a single patch, 5cfb9d930f11e.
The patch takes a few shortcuts, partly because I wanted not to block the vkd3d
API design questions, but also because while writing it I came up with a couple
of problems that I wasn't sure how to fix. There are two main problems I see:
(1) This patch has the user pass the signature from the pixel shader when
compiling the vertex shader, and looks up register indices already
arbitrarily allocated by the pixel shader. This is problematic when trying
to use this signature as a cache key, by virtue of it not being clear (or
even defined) which fields are key elements and which aren't. It's also not
particularly kind to lookup, on account of not being directly comparable
with memcmp(). There are a few options I see:
(a) Provide an internal function to compare key elements. This feels... odd,
like a very special-purpose function, but perhaps workable.
(b) Just make the user deal with it, and assert that all fields are key
elements.
(c) Use some alternative, perhaps shortened structure as a field of
vkd3d_shader_next_stage_info. This has the disadvantage that it is not
as simple for a hypothetical user to retrieve from the pixel shader, but
we would presumably provide a function to generate one from a shader
signature. This would probably also be kinder to cache lookup if it's
shorter.
(d) Make caching vkd3d's responsibility, to some degree. This seems
daunting, but the more we optimize, the more difficult it may be to
design API that allows for nice caching.
(2) Assuming we use signatures, there is a memory management problem that
5cfb9d930f11e spells out. This is probably a matter of "just fix it", but
I suppose another option is to take the GLSL or ARB architecture.
--
April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
--
v6: vkd3d-shader: Introduce a function to build a varying map between sm1 stages.
vkd3d-shader/spirv: Make output varyings not consumed by the next stage private variables.
vkd3d-shader: Implement remapping shader output registers to match the next shader's semantics.
vkd3d-shader: Add a separate field for the target location of a signature element.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/280
i) Introduce a helper for calculating transform properties.
ii) Re-use get_log_fontW to populate LOGFONTW from GpFont.
Signed-off-by: David Kahurani k.kahurani(a)gmail.com
--
v4: gdiplus: Use get_log_fontW in GdipGetLogFontW
gdiplus: Use helper to calculate transform properties
https://gitlab.winehq.org/wine/wine/-/merge_requests/3471
1. Subtracting integers may result in an overflow or underflow.
2. Right at the 'edge' of underflowing, the result of subtraction may be
`INT_MIN`, and the call to `abs()` will also result in `INT_MIN`.
This fix accounts for all of these conditions.
EXAMPLES:
1. can be encountered by comparing 2.0 and -2.0
2. can be encountered by comparing -2.0 and 2.0
NOTE:
There are 14 more instances of `compare_float` across several modules.
I would like to make sure the logic is sound with this MR, then I can take on the rest.
--
v3: ddraw/tests: Fix overflow bugs in compare_float.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3458
This is first part in a series that implements Cycle Collection for every mshtml object (dispex) and cleans up rest of the code based on it, which is obviously needed due to dynamic props and other extra object-specific refs.
In an effort to split it up as much as possible, since it already has quite a lot of restructuring and changes, some of the earlier patches will introduce temporary leaks or cyclic refs, but that's because we'll later handle them properly with the dispex CC. These shouldn't affect behavior, though, so it shouldn't pose problems for functionality.
Nodes are, initially, not changed much (other than to make it compatible with the dispex) to keep changes as small as possible. They still use their own CC mechanism and refcounting, which is hackish but that is solved in a follow-up MR, so it's temporary only.
Eventually, every object (including nodes) will use the dispex's vtbl to do its Cycle Collection, except for stuff like outer window (which is a special case).
In this first part, the objects that are using the node CC will have no-op dispex CC methods since they are using the node's, but this is temporary only.
v2: Now the entire first part will be split up into several MRs. This only moves destruction/unlinking of dispex into separate vtbl methods, without actually using any of the CC yet. `release_dispex` will now call the unlink and destructor, if available (it won't be optional later, but for now it is), so only those converted objects make use of it.
Most of the patches are small and typically convert one object at a time, except the first 4. More will follow up in subsequent MRs.
--
v3: mshtml: Use unlink and destructor in the vtbl for the MutationObserver
mshtml: Use unlink and destructor in the vtbl for HTMLXMLHttpRequestFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLOptionElementFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLImageElementFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheet.
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheetsCollection.
mshtml: Use unlink and destructor in the vtbl for
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheetRule.
mshtml: Use unlink and destructor in the vtbl for CSSStyle.
mshtml: Use unlink and destructor in the vtbl for inner windows.
mshtml: Use unlink and destructor in the vtbl for HTMLEventObj.
mshtml: Use separate dispex destructors for different event types.
mshtml: Use unlink and destructor in the vtbl for function disps.
mshtml: Introduce unlink_ref helper.
mshtml: Use the common HTMLElement dispex vtbl in the dispex definitions.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3408
This is first part in a series that implements Cycle Collection for every mshtml object (dispex) and cleans up rest of the code based on it, which is obviously needed due to dynamic props and other extra object-specific refs.
In an effort to split it up as much as possible, since it already has quite a lot of restructuring and changes, some of the earlier patches will introduce temporary leaks or cyclic refs, but that's because we'll later handle them properly with the dispex CC. These shouldn't affect behavior, though, so it shouldn't pose problems for functionality.
Nodes are, initially, not changed much (other than to make it compatible with the dispex) to keep changes as small as possible. They still use their own CC mechanism and refcounting, which is hackish but that is solved in a follow-up MR, so it's temporary only.
Eventually, every object (including nodes) will use the dispex's vtbl to do its Cycle Collection, except for stuff like outer window (which is a special case).
In this first part, the objects that are using the node CC will have no-op dispex CC methods since they are using the node's, but this is temporary only.
v2: Now the entire first part will be split up into several MRs. This only moves destruction/unlinking of dispex into separate vtbl methods, without actually using any of the CC yet. `release_dispex` will now call the unlink and destructor, if available (it won't be optional later, but for now it is), so only those converted objects make use of it.
Most of the patches are small and typically convert one object at a time, except the first 4. More will follow up in subsequent MRs.
--
v2: mshtml: Use unlink and destructor in the vtbl for the MutationObserver
mshtml: Use unlink and destructor in the vtbl for HTMLXMLHttpRequestFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLOptionElementFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLImageElementFactory.
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheet.
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheetsCollection.
mshtml: Use unlink and destructor in the vtbl for
mshtml: Use unlink and destructor in the vtbl for HTMLStyleSheetRule.
mshtml: Use unlink and destructor in the vtbl for CSSStyle.
mshtml: Use unlink and destructor in the vtbl for inner windows.
mshtml: Use unlink and destructor in the vtbl for HTMLEventObj.
mshtml: Use separate dispex destructors for different event types.
mshtml: Use unlink and destructor in the vtbl for function disps.
mshtml: Introduce unlink_ref helper.
mshtml: Use the common HTMLElement dispex vtbl in the dispex definitions.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3408
Particularly it implements '/upn', '/logonid' and '/user' options,
where the first one uses newly added NameUserPrincipal format in GetUserNameExW.
--
v2: whoami: Refactor and add more commands.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3473
i) Introduce a helper for calculating transform properties.
ii) Re-use get_log_fontW to populate LOGFONTW from GpFont.
Signed-off-by: David Kahurani k.kahurani(a)gmail.com
--
v2: gdiplus: Use get_log_fontW in GdipGetLogFontW
gdiplus: Use helper to calculate transform properties
https://gitlab.winehq.org/wine/wine/-/merge_requests/3471
i) Introduce a helper for calculating transform properties.
ii) Re-use get_log_fontW to populate LOGFONTW from GpFont.
Signed-off-by: David Kahurani k.kahurani(a)gmail.com
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3471
DirectX's FFP normalizes the zero vector to the zero vector, but GLSL
normalizes them to all NaN.
This patch creates `ffp_normalize`, which normalizes vectors but
has DirectX's behavior on 0 vectors. Further patches in this set switch
other calls from GLSL's `normalize` to the 0-safe version.
This fixes [36564](https://bugs.winehq.org/show_bug.cgi?id=36564).
--
v3: wined3d: Use ffp_normalize in shader_glsl_ffp_vertex_lighting_footer.
wined3d: Use ffp_normalize in shader_glsl_ffp_vertex_lighting.
wined3d: Implement a zero-safe normalize function for FFP.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3419
1. Subtracting integers may result in an overflow or underflow.
2. Right at the 'edge' of underflowing, the result of subtraction may be
`INT_MIN`, and the call to `abs()` will also result in `INT_MIN`.
This fix accounts for all of these conditions.
EXAMPLES:
1. can be encountered by comparing 2.0 and -2.0
2. can be encountered by comparing -2.0 and 2.0
NOTE:
There are 14 more instances of `compare_float` across several modules.
I would like to make sure the logic is sound with this MR, then I can take on the rest.
--
v2: ddraw/tests: Fix overflow bugs in compare_float.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3458
1. Subtracting integers may result in an overflow or underflow.
2. Right at the 'edge' of underflowing, the result of subtraction may be
`INT_MIN`, and the call to `abs()` will also result in `INT_MIN`.
This fix accounts for all of these conditions.
EXAMPLES:
1. can be encountered by comparing 2.0 and -2.0
2. can be encountered by comparing -2.0 and 2.0
NOTE:
There are 14 more instances of `compare_float` across several modules.
I would like to make sure the logic is sound with this MR, then I can take on the rest.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3458
This is probably the most ugly and controversial bit of API. I don't really know
if this is the right approach to solving this.
sm4 registers match by register index, such that shaders can mostly be compiled
in isolation. sm1 does not—registers may be specified in any order in the vertex
and pixel shaders, and will be matched by usage and usage index.
By itself this is not much of a problem. Where it gets hairy is that we want to
do some degree of caching, as well as pre-compilation, and avoid recompiling
either shader every time it's matched with a new one.
Wine currently deals with this problem, for GLSL, by generating a "main" GLSL
shader for the vertex shader, and then an extra function setup_vs_output(), and
linking the two together every time a new pixel shader is used. This could in
theory be used for SPIR-V, but it requires the use of extra, probably external,
code to link SPIR-V shaders together, which I do not particularly anticipate
being well-received.
(I'm not sure how Wine deals with this problem in the ARB backend. It seems to
take the pixel shader signature into account when generating the vertex shader—
cf. init_output_registers()—but it doesn't take it into account when looking up
a vertex shader variant? I didn't look too closely at the code, so maybe I'm
missing something.)
--
The vkd3d parts of this patch are quite straightforward, and looking at them, I
think the design is quite intuitive in isolation. There may be some room for
internal refactoring (in particular with an eye to not so much overloading
the "register_index" field of struct shader_signature_element) but I'm
relatively happy with the way it turned out. In isolation, that is.
The Wine part is worse. I've uploaded branches for vkd3d and Wine that use this
API, and correctly handle shaders with some nontrivial reordering:
https://gitlab.winehq.org/zfigura/vkd3d/-/commits/himavant5https://gitlab.winehq.org/zfigura/wine/-/commits/himavant_cb
The test can be run, as before, like so:
make tests/shader_runner.exe && WINE_D3D_CONFIG=renderer=vulkan wine tests/shader_runner.exe ../vkd3d/tests/hlsl/sm1-interstage-interface.shader_test
The interesting Wine parts are concentrated in a single patch, 5cfb9d930f11e.
The patch takes a few shortcuts, partly because I wanted not to block the vkd3d
API design questions, but also because while writing it I came up with a couple
of problems that I wasn't sure how to fix. There are two main problems I see:
(1) This patch has the user pass the signature from the pixel shader when
compiling the vertex shader, and looks up register indices already
arbitrarily allocated by the pixel shader. This is problematic when trying
to use this signature as a cache key, by virtue of it not being clear (or
even defined) which fields are key elements and which aren't. It's also not
particularly kind to lookup, on account of not being directly comparable
with memcmp(). There are a few options I see:
(a) Provide an internal function to compare key elements. This feels... odd,
like a very special-purpose function, but perhaps workable.
(b) Just make the user deal with it, and assert that all fields are key
elements.
(c) Use some alternative, perhaps shortened structure as a field of
vkd3d_shader_next_stage_info. This has the disadvantage that it is not
as simple for a hypothetical user to retrieve from the pixel shader, but
we would presumably provide a function to generate one from a shader
signature. This would probably also be kinder to cache lookup if it's
shorter.
(d) Make caching vkd3d's responsibility, to some degree. This seems
daunting, but the more we optimize, the more difficult it may be to
design API that allows for nice caching.
(2) Assuming we use signatures, there is a memory management problem that
5cfb9d930f11e spells out. This is probably a matter of "just fix it", but
I suppose another option is to take the GLSL or ARB architecture.
--
April is the cruellest month, breeding
Lilacs out of the dead land, mixing
Memory and desire, stirring
Dull roots with spring rain.
--
v5: vkd3d-shader: Introduce a function to build a varying map between sm1 stages.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/280