--
v7: vkd3d-shader/spirv: Handle thread group UAV barriers.
vkd3d-shader/spirv: Include Uniform in the memory semantics for UAV barriers.
vkd3d-shader/spirv: Handle globally coherent UAVs.
vkd3d-shader: Introduce a UAV_GLOBALLY_COHERENT descriptor info flag.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/306
Normally I am a strong proponent of adjusting implementations one small step at
a time, but this is a bit of an extreme case.
The current state of UrlCanonicalize() is a prototypical example of "spaghetti
code": the implementation has been repeatedly adjusted to fix a new discovered
edge case, without properly testing the scope of the new logic, with the effect
that the current logic is a scattered mess of conditions that interact in
unpredictable ways.
To be fair, the actual function is much more complicated than one would
anticipate, and the number of new weird edge cases I found while trying to flesh
out existing logic was rather exhausting.
I initially tried to "fix" the existing implementation one step at a time. After
racking up dozens of commits without an end in sight, though, and having to
waste time carefully unravelling the broken code in the right order so as to
avoid temporary failures, I decided instead to just fix everything at once,
effectively rewriting the function from scratch, and this proved to be much more
productive.
Hopefully the huge swath of new tests is enough to prove that this new
implementation really is correct, and has no more spaghetti than what Microsoft
made necessary.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/4954
--
v13: tests/d3d12: Test multiple clip distance inputs in test_clip_distance().
tests/d3d12: Use five clip distances for the multiple test in test_clip_distance().
vkd3d-shader/ir: Transform clip/cull outputs and patch constants into arrays.
vkd3d-shader/ir: Transform clip/cull inputs into an array.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/564
For temporary registers, SM1-SM3 integer types are internally
represented as floating point, so, in order to perform a cast
from ints to floats we need a mere MOV.
By the same token, casts from floats to ints can also be implemented with a FLOOR + MOV,
where the FLOOR is then lowered by the lower_floor() pass.
For constant integer registers "iN" there is no operation for casting
from a floating point register to them. For address registers "aN", and
the loop counting register "aL", vertex shaders have the "mova" operation
but we haven't used these registers in any way yet.
We probably would want to introduce these as synthetic variables
allocated in a special register set. In that case we have to remember to
use MOVA instead of MOV in the store operations, but they shouldn't be src
or dst of CAST operations.
Regarding constant integer registers, in some shaders, constants are
expected to be received formatted as an integer, such as:
int m;
float4 main() : sv_target
{
float4 res = {0, 0, 0, 0};
for (int k = 0; k < m; ++k)
res += k;
return res;
}
which compiles as:
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// m i0 1
//
ps_3_0
def c0, 0, 1, 0, 0
mov r0, c0.x
mov r1.x, c0.x
rep i0
add r0, r0, r1.x
add r1.x, r1.x, c0.y
endrep
mov oC0, r0
but this only happens if the integer constant is used directly in an
instruction that needs it, and as I said there is no instruction that
allows converting them to a float representation.
Notice how a more complex shader, that performs operations with this
integer variable "m":
int m;
float4 main() : sv_target
{
float4 res = {0, 0, 0, 0};
for (int k = 0; k < m * m; ++k)
res += k;
return res;
}
gives the following output:
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// m c0 1
//
ps_3_0
def c1, 0, 0, 1, 0
defi i0, 255, 0, 0, 0
mul r0.x, c0.x, c0.x
mov r1, c1.y
mov r0.y, c1.y
rep i0
mov r0.z, r0.x
break_ge r0.y, r0.z
add r1, r0.y, r1
add r0.y, r0.y, c1.z
endrep
mov oC0, r1
Meaning that the uniform "m" is just stored as a floating point in
"c0", the constant integer register "i0" is just set to 255 (hoping
it is a high enough value) using "defi", and the "break_ge"
involving c0 is used to break from the loop.
We could potentially use this approach to implement loops from SM3
without expecting the variables being received as constant integer
registers.
According to the D3D documentation, for SM1-SM3 constant integer
registers are only used by the 'loop' and 'rep' instructions.
--
v2: vkd3d-shader/hlsl: Lower casts to int for SM1.
tests: Add simple test for implicit cast to int.
vkd3d-shader/d3dbc: Implement casts from ints to floats as a MOV.
tests: Remove [require] directives for tests that use int and bool uniforms.
tests/shader-runner: Pass bool uniforms as IEEE 754 floats to SM1 profiles.
tests/shader-runner: Pass int uniforms as IEEE 754 floats to SM1 profiles.
tests/shader-runner: Introduce "only" qualifier.
tests: Don't ignore SM1 on a non-const-indexing.shader_test test.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/608