This removes 20 `movaps` instructions from every syscall that calls a sysv_abi function, plus an `and` for stack alignment and some other instructions depending on the function.
In `NtAllocateLocallyUniqueId` for example this reduces the number of instructions from 63 to 36.
I don't entirely understand the llvm-mca output, but here are the before and after stats it reports for that function:
Before
Iterations: 100
Instructions: 6300
Total Cycles: 3335
Total uOps: 6300
Dispatch Width: 6
uOps Per Cycle: 1.89
IPC: 1.89
Block RThroughput: 15.0
After
Iterations: 100
Instructions: 3600
Total Cycles: 1514
Total uOps: 3600
Dispatch Width: 6
uOps Per Cycle: 2.38
IPC: 2.38
Block RThroughput: 6.0
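For anyone else reading these numbers, here is a rough sketch of how the derived values relate to the raw counters. IPC is instructions divided by total cycles, and since every instruction here is a single uOp, uOps Per Cycle comes out the same. The `mca_*` names are made up for this illustration and are not part of llvm-mca. The dispatch bound is only a lower limit on Block RThroughput, assuming the dispatch width is the only constraint.

```c
/* Raw counters copied from the llvm-mca summaries above. */
struct mca_summary
{
    unsigned int iterations;
    unsigned int instructions;
    unsigned int total_uops;
    unsigned int total_cycles;
    unsigned int dispatch_width;
};

static const struct mca_summary mca_before = { 100, 6300, 6300, 3335, 6 };
static const struct mca_summary mca_after  = { 100, 3600, 3600, 1514, 6 };

/* IPC: simulated instructions divided by simulated cycles. */
static double mca_ipc( const struct mca_summary *s )
{
    return (double)s->instructions / s->total_cycles;
}

/* Average simulated cycles spent on one pass over the function body. */
static double mca_cycles_per_iteration( const struct mca_summary *s )
{
    return (double)s->total_cycles / s->iterations;
}

/* Cycles per iteration if dispatch width were the only limit
 * (uOps per iteration divided by dispatch width). */
static double mca_dispatch_bound( const struct mca_summary *s )
{
    return (double)s->total_uops / s->iterations / s->dispatch_width;
}
```

With these, the simulated cost per call drops from about 33.4 to about 15.1 cycles. The dispatch bound is 10.5 before and 6.0 after. The "after" value matches the reported Block RThroughput of 6.0, which suggests the new code is limited by dispatch width, while the old code (reported as 15.0) was limited by some other resource.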
This currently depends on the stack being aligned by the syscall dispatcher, which afaict is the case if `sizeof(struct syscall_frame) % 16 == 0`. If that is not good enough I can add an `andq $~15,%rsp` somewhere.
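To make the alignment assumption concrete, here is a minimal sketch of what such a check could look like. `struct example_frame` is a hypothetical stand-in and does not reflect the real layout of `struct syscall_frame`. The helper names are also invented for this example. It assumes the stack pointer is 16-byte aligned before the frame is carved out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT the real struct syscall_frame. */
struct example_frame
{
    uint64_t gprs[16];  /* 128 bytes */
    uint64_t rip;       /*   8 bytes */
    uint64_t eflags;    /*   8 bytes, 144 in total = 9 * 16 */
};

/* Fails at compile time if a later change breaks the size assumption. */
_Static_assert( sizeof(struct example_frame) % 16 == 0,
                "frame size must be a multiple of 16" );

/* Stack pointer after reserving a frame; the stack grows downwards. */
static uintptr_t sp_after_frame( uintptr_t sp, size_t frame_size )
{
    return sp - frame_size;
}

/* C equivalent of `andq $~15,%rsp`: round down to a multiple of 16. */
static uintptr_t align_down_16( uintptr_t sp )
{
    return sp & ~(uintptr_t)15;
}
```

If the size is a multiple of 16, `sp_after_frame()` preserves whatever alignment the incoming stack pointer had, so no extra instruction is needed. The explicit `align_down_16()` variant would work regardless of the frame size, at the cost of one `and`.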
One question I have is whether we want to continue supporting CDECL syscalls (only `wine_server_call`, `wine_server_fd_to_handle` and `wine_server_handle_to_fd`)?
If we do, this adds a bit of complexity to the syscall dispatcher, see the commit "FIXUP ntdll: Support CDECL syscalls."
If we don't, and make those syscalls WINAPI instead, then for every call to those functions on x86 it seems to either change nothing or add one `add` instruction. However, we of course lose the ability to make CDECL syscalls.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1752
--
v2: wined3d: Enable long types in texture.c.
wined3d: Reduce usage of long integral types in texture.c.
wined3d: Change stencil parameter type in blitter_clear() method.
wined3d: Get/set texture's level_count and lod as unsigned int.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1727
We are currently not initializing static values to zero by default.
Consider the following shader:
```hlsl
static float4 va;
float4 main() : sv_target
{
return va;
}
```
We get the following output:
```
ps_5_0
dcl_output o0.xyzw
dcl_temps 2
mov r0.xyzw, r1.xyzw
mov o0.xyzw, r0.xyzw
ret
```
where r1.xyzw is not initialized.
This patch solves this by assigning the static variable the value of a
uint 0, thus relying on complex broadcasts.
This seems to be the behaviour of the native compiler, since it reports
the following error on a shader that lacks an initializer on a data type with
object components:
```
error X3017: cannot convert from 'uint' to 'struct <unnamed>'
```
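As a toy model of why the error above points at a broadcast, the sketch below walks a type and checks whether a single `uint 0` can be spread over every component. It is written in C for illustration only. The `toy_*` types and functions are invented here and are not vkd3d-shader's actual data structures or code.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_NUMERIC, TOY_OBJECT, TOY_STRUCT };

struct toy_type
{
    enum toy_kind kind;
    size_t count;                          /* components (numeric) or fields (struct) */
    const struct toy_type *const *fields;  /* only used for TOY_STRUCT */
};

/* Returns true if a scalar 0 can be broadcast to every component of the type.
 * Object components (textures, samplers, ...) have no numeric value to receive it. */
static bool toy_can_broadcast_zero( const struct toy_type *type )
{
    size_t i;

    switch (type->kind)
    {
    case TOY_NUMERIC:
        return true;
    case TOY_OBJECT:
        return false;
    case TOY_STRUCT:
        for (i = 0; i < type->count; ++i)
            if (!toy_can_broadcast_zero( type->fields[i] )) return false;
        return true;
    }
    return false;
}

static const struct toy_type toy_float4  = { TOY_NUMERIC, 4, NULL };
static const struct toy_type toy_texture = { TOY_OBJECT, 0, NULL };

static const struct toy_type *const toy_plain_fields[] = { &toy_float4, &toy_float4 };
static const struct toy_type *const toy_mixed_fields[] = { &toy_float4, &toy_texture };

static const struct toy_type toy_plain_struct = { TOY_STRUCT, 2, toy_plain_fields };
static const struct toy_type toy_mixed_struct = { TOY_STRUCT, 2, toy_mixed_fields };
```

In this model `toy_plain_struct` can be zero-initialized, while `toy_mixed_struct` cannot, which would correspond to the `cannot convert from 'uint'` case.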
--
v4: vkd3d-shader/hlsl: Allow uninitialized static objects.
vkd3d-shader/hlsl: Validate that non-uniform objects are not referenced.
tests: Add additional object references tests.
tests: Test proper initialization of static structs to zero.
vkd3d-shader/hlsl: Initialize static variables to 0 by default.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/54