[PATCH 0/4] MR602: Draft: vkd3d-shader/dxil: Inverse trigonometry


Conor McCarthy (＠cmccarthy)

25 Jan 2024

6:33 a.m.

With tests from !364, separated out from the HLSL changes there and updated. This MR can wait until 364 is upstream though.

It is apparently unnecessary to match the SM4/5 implementation, since the AMD Windows results differ. The RADV results are a bit wrong, but Proton uses the SPIR-V GLSL extension instructions too, and no workarounds have been implemented there.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602


Petrichor Park

25 Jan

6:33 a.m.

New subject: [PATCH 1/4] tests/shader-runner: Add tests for acos and asin trig intrinsics.

From: Petrichor Park ppark@codeweavers.com

Extracted by Conor McCarthy from an HLSL patch, and modified to include SM 6 variations.
---
 Makefile.am                         |  1 +
 tests/hlsl/inverse-trig.shader_test | 91 +++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 tests/hlsl/inverse-trig.shader_test

diff --git a/Makefile.am b/Makefile.am
index bfd11fdb4..959d0f6ce 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -121,6 +121,7 @@ vkd3d_shader_tests = \
 	tests/hlsl/initializer-struct.shader_test \
 	tests/hlsl/intrinsic-override.shader_test \
 	tests/hlsl/invalid.shader_test \
+	tests/hlsl/inverse-trig.shader_test \
 	tests/hlsl/is-front-face.shader_test \
 	tests/hlsl/ldexp.shader_test \
 	tests/hlsl/length.shader_test \
diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test
new file mode 100644
index 000000000..e2661fbaf
--- /dev/null
+++ b/tests/hlsl/inverse-trig.shader_test
@@ -0,0 +1,91 @@
+% Microsoft natively outputs values that are slightly mathematically wrong.
+% VKD3D faithfully does the same.
+[pixel shader todo]
+uniform float4 a;
+
+float4 main() : sv_target
+{
+    return float4(acos(a.x), 0.0, 0.0, 0.0);
+}
+
+[test]
+uniform 0 float4 -1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (3.14159274, 0.0, 0.0, 0.0) 128
+
+uniform 0 float4 -0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (2.094441441, 0.0, 0.0, 0.0) 128
+
+uniform 0 float4 0.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.57072878, 0.0, 0.0, 0.0) 1024
+
+uniform 0 float4 0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.04715133, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.0, 0.0, 0.0, 0.0) 128
+
+[pixel shader todo]
+uniform float4 a;
+
+float4 main() : sv_target
+{
+    return float4(asin(a.x), 0.0, 0.0, 0.0);
+}
+
+[test]
+uniform 0 float4 -1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-1.57079637, 0.0, 0.0, 0.0) 128
+
+[require]
+shader model < 6.0
+
+[test]
+uniform 0 float4 -0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-0.523645043, 0.0, 0.0, 0.0) 128
+
+% Because sqrt isn't identical across platforms, there is some inaccuracy
+% here even with an identical algorithm, and because it's so near zero,
+% each ulp is really small. So, in order to pass there needs to be this
+% enormous margin.
+uniform 0 float4 0.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.0000675916672, 0.0, 0.0, 0.0) 131072
+
+uniform 0 float4 0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.523645043, 0.0, 0.0, 0.0) 128
+
+[require]
+shader model >= 6.0
+
+% SM 6.0 has instructions for inverse trig, which we implement using the native
+% equivalents available in SPIR-V. The values below are from the AMD Windows
+% drivers, which are very close to those from Ubuntu's calculator app. Results
+% from RADV are a bit lower than these, hence the large max ulp difference.
+[test]
+uniform 0 float4 -0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-0.523598731, 0.0, 0.0, 0.0) 4096
+
+uniform 0 float4 0.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.0, 0.0, 0.0, 0.0) 128
+
+uniform 0 float4 0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.523598731, 0.0, 0.0, 0.0) 4096
+
+[require]
+% reset requirements
+
+[test]
+uniform 0 float4 1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.57079637, 0.0, 0.0, 0.0) 128

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602
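
The trailing number on each probe line in the tests above is a maximum ULP difference. To illustrate what such a tolerance means, here is a minimal C sketch of a ULP comparison. The helper names (float_to_ordered, ulp_distance, within_ulps) are hypothetical, and this is not the shader runner's actual code. It assumes IEEE 754 single precision and does not handle NaN.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a float's bit pattern onto a signed integer line
 * where adjacent representable floats differ by exactly 1. */
static int64_t float_to_ordered(float f)
{
    int32_t bits;

    memcpy(&bits, &f, sizeof(bits));
    /* Floats are sign-magnitude, so mirror the negative half. This also
     * makes -0.0 and +0.0 map to the same point. */
    if (bits < 0)
        return -(int64_t)(bits & 0x7fffffff);
    return bits;
}

/* Number of representable floats between a and b. */
static uint64_t ulp_distance(float a, float b)
{
    int64_t ia = float_to_ordered(a), ib = float_to_ordered(b);

    return (uint64_t)(ia > ib ? ia - ib : ib - ia);
}

/* Assumed semantics of a probe tolerance: pass if the result is at most
 * max_ulps representable values away from the expected value. */
static bool within_ulps(float expected, float actual, uint64_t max_ulps)
{
    return ulp_distance(expected, actual) <= max_ulps;
}
```

Under these assumptions, within_ulps(1.57072878f, 1.57079637f, 1024) holds while the same check with 128 does not, which is consistent with the acos(0.0) case carrying a tolerance of 1024. Near zero the picture changes: floats are far denser there, so even 131072 ULPs around 0.0000675916672 does not reach 0.0.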

Petrichor Park

6:33 a.m.

New subject: [PATCH 2/4] tests/shader-runner: Add tests for atan and atan2 trig intrinsics.

From: Petrichor Park ppark@codeweavers.com

Extracted by Conor McCarthy from an HLSL patch, with ulp values doubled in some cases to cover SM 6 results.
---
 tests/hlsl/inverse-trig.shader_test | 107 ++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test
index e2661fbaf..0cf3f793b 100644
--- a/tests/hlsl/inverse-trig.shader_test
+++ b/tests/hlsl/inverse-trig.shader_test
@@ -89,3 +89,110 @@ probe all rgba (0.523598731, 0.0, 0.0, 0.0) 4096
 uniform 0 float4 1.0 0.0 0.0 0.0
 todo draw quad
 probe all rgba (1.57079637, 0.0, 0.0, 0.0) 128
+
+
+[pixel shader todo]
+uniform float4 a;
+
+float4 main() : sv_target
+{
+    return float4(atan(a.x), 0.0, 0.0, 0.0);
+}
+
+[test]
+uniform 0 float4 -1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-0.785409629, 0.0, 0.0, 0.0) 512
+
+uniform 0 float4 -0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-0.4636476, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.0, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.5 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.4636476, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (0.785409629, 0.0, 0.0, 0.0) 512
+
+[pixel shader todo]
+uniform float4 a;
+
+float4 main() : sv_target
+{
+    // Because the argument order is (y,x) the numbers are
+    // passed in "backwards" here, so they're the right way in the
+    // test cases.
+    return float4(atan2(a.x, a.y), 0.0, 0.0, 0.0);
+}
+
+[test]
+% Non-degenerate cases
+uniform 0 float4 1.0 1.0 0.0 0.0
+todo draw quad
+probe all rgba (0.785385, 0.0, 0.0, 0.0) 512
+
+uniform 0 float4 5.0 -5.0 0.0 0.0
+todo draw quad
+probe all rgba (2.356194, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 -3.0 -3.0 0.0 0.0
+todo draw quad
+probe all rgba (-2.356194, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 -1.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (-1.570796, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.0 1.0 0.0 0.0
+todo draw quad
+probe all rgba (0.0, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.0 -1.0 0.0 0.0
+todo draw quad
+probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256
+
+% Degenerate cases
+uniform 0 float4 0.00001 0.00002 0.0 0.0
+todo draw quad
+probe all rgba (0.463647, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.00001 -0.00002 0.0 0.0
+todo draw quad
+probe all rgba (2.677945, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 -0.00001 100000.0 0.0 0.0
+todo draw quad
+probe all rgba (-0.000000000099986595, 0.0, 0.0, 0.0) 2048
+
+uniform 0 float4 10000000.0 0.00000001 0.0 0.0
+todo draw quad
+probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
+
+% Negative zero behavior should be to treat it the
+% same as normal zero.
+uniform 0 float4 1000000000.0 0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 1000000000.0 -0.0 0.0 0.0
+todo draw quad
+probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 0.0 -1.0 0.0 0.0
+todo draw quad
+probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256
+
+uniform 0 float4 -0.0 -1.0 0.0 0.0
+todo draw quad
+probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256
+

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602

Conor McCarthy

6:33 a.m.

New subject: [PATCH 3/4] vkd3d-shader/dxil: Handle inverse trigonometric functions in sm6_parser_emit_dx_unary().

From: Conor McCarthy cmccarthy@codeweavers.com

---
 libs/vkd3d-shader/d3d_asm.c              |  3 +++
 libs/vkd3d-shader/dxil.c                 | 12 ++++++++++++
 libs/vkd3d-shader/vkd3d_shader_private.h |  3 +++
 3 files changed, 18 insertions(+)

diff --git a/libs/vkd3d-shader/d3d_asm.c b/libs/vkd3d-shader/d3d_asm.c
index 6ec7a9c99..2039e0816 100644
--- a/libs/vkd3d-shader/d3d_asm.c
+++ b/libs/vkd3d-shader/d3d_asm.c
@@ -30,8 +30,11 @@
 static const char * const shader_opcode_names[] =
 {
     [VKD3DSIH_ABS ] = "abs",
+    [VKD3DSIH_ACOS ] = "acos",
     [VKD3DSIH_ADD ] = "add",
     [VKD3DSIH_AND ] = "and",
+    [VKD3DSIH_ASIN ] = "asin",
+    [VKD3DSIH_ATAN ] = "atan",
     [VKD3DSIH_ATOMIC_AND ] = "atomic_and",
     [VKD3DSIH_ATOMIC_CMP_STORE ] = "atomic_cmp_store",
     [VKD3DSIH_ATOMIC_IADD ] = "atomic_iadd",
diff --git a/libs/vkd3d-shader/dxil.c b/libs/vkd3d-shader/dxil.c
index c089d132c..39fcf671e 100644
--- a/libs/vkd3d-shader/dxil.c
+++ b/libs/vkd3d-shader/dxil.c
@@ -331,6 +331,9 @@ enum dx_intrinsic_opcode
     DX_ISNAN = 8,
     DX_ISINF = 9,
     DX_ISFINITE = 10,
+    DX_ACOS = 15,
+    DX_ASIN = 16,
+    DX_ATAN = 17,
     DX_EXP = 21,
     DX_FRC = 22,
     DX_LOG = 23,
@@ -3503,6 +3506,12 @@ static enum vkd3d_shader_opcode map_dx_unary_op(enum dx_intrinsic_opcode op)
             return VKD3DSIH_ISINF;
         case DX_ISFINITE:
             return VKD3DSIH_ISFINITE;
+        case DX_ACOS:
+            return VKD3DSIH_ACOS;
+        case DX_ASIN:
+            return VKD3DSIH_ASIN;
+        case DX_ATAN:
+            return VKD3DSIH_ATAN;
         case DX_EXP:
             return VKD3DSIH_EXP;
         case DX_FRC:
@@ -3831,6 +3840,9 @@ struct sm6_dx_opcode_info
  */
 static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
 {
+    [DX_ACOS ] = {"g", "R", sm6_parser_emit_dx_unary},
+    [DX_ASIN ] = {"g", "R", sm6_parser_emit_dx_unary},
+    [DX_ATAN ] = {"g", "R", sm6_parser_emit_dx_unary},
     [DX_BFREV ] = {"m", "R", sm6_parser_emit_dx_unary},
     [DX_BUFFER_LOAD ] = {"o", "Hii", sm6_parser_emit_dx_buffer_load},
     [DX_CBUFFER_LOAD_LEGACY ] = {"o", "Hi", sm6_parser_emit_dx_cbuffer_load},
diff --git a/libs/vkd3d-shader/vkd3d_shader_private.h b/libs/vkd3d-shader/vkd3d_shader_private.h
index 51daf2153..b326eec45 100644
--- a/libs/vkd3d-shader/vkd3d_shader_private.h
+++ b/libs/vkd3d-shader/vkd3d_shader_private.h
@@ -224,8 +224,11 @@ enum vkd3d_shader_error
 enum vkd3d_shader_opcode
 {
     VKD3DSIH_ABS,
+    VKD3DSIH_ACOS,
     VKD3DSIH_ADD,
     VKD3DSIH_AND,
+    VKD3DSIH_ASIN,
+    VKD3DSIH_ATAN,
     VKD3DSIH_ATOMIC_AND,
     VKD3DSIH_ATOMIC_CMP_STORE,
     VKD3DSIH_ATOMIC_IADD,

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602
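
The patch above maps three DXIL intrinsic opcodes onto internal opcodes and registers them in a table indexed by opcode. The following is a simplified, standalone C sketch of that general pattern. All names prefixed with example_ or EXAMPLE_ are hypothetical and do not exist in vkd3d; only the opcode numbers 15, 16, 17 and 21 are taken from the diff. Note that the patch does the mapping with a switch statement, whereas this sketch uses a sparse designated-initializer table for it.

```c
#include <stddef.h>

/* Opcode numbers as they appear in the patch; the enum itself is hypothetical. */
enum example_dx_op
{
    EXAMPLE_DX_ACOS = 15,
    EXAMPLE_DX_ASIN = 16,
    EXAMPLE_DX_ATAN = 17,
    EXAMPLE_DX_EXP = 21,
};

/* Hypothetical internal opcodes. Zero is reserved for "no mapping". */
enum example_ir_op
{
    EXAMPLE_IR_INVALID = 0,
    EXAMPLE_IR_ACOS,
    EXAMPLE_IR_ASIN,
    EXAMPLE_IR_ATAN,
    EXAMPLE_IR_EXP,
};

/* Sparse table: entries not listed are zero-initialised, i.e. invalid. */
static const enum example_ir_op example_unary_map[] =
{
    [EXAMPLE_DX_ACOS] = EXAMPLE_IR_ACOS,
    [EXAMPLE_DX_ASIN] = EXAMPLE_IR_ASIN,
    [EXAMPLE_DX_ATAN] = EXAMPLE_IR_ATAN,
    [EXAMPLE_DX_EXP] = EXAMPLE_IR_EXP,
};

/* Look up a unary opcode, rejecting out-of-range and unmapped values. */
static enum example_ir_op example_map_unary_op(unsigned int op)
{
    size_t count = sizeof(example_unary_map) / sizeof(example_unary_map[0]);

    if (op >= count)
        return EXAMPLE_IR_INVALID;
    return example_unary_map[op];
}
```

With this layout, supporting another unary intrinsic is a one-line table change, and a gap such as opcode 18 simply yields EXAMPLE_IR_INVALID.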

Conor McCarthy

6:33 a.m.

New subject: [PATCH 4/4] vkd3d-shader/spirv: Handle the ACOS, ASIN and ATAN instructions in spirv_compiler_emit_ext_glsl_instruction().

From: Conor McCarthy cmccarthy@codeweavers.com

---
 libs/vkd3d-shader/spirv.c           |  6 +++
 tests/hlsl/inverse-trig.shader_test | 60 ++++++++++++++---------------
 2 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/libs/vkd3d-shader/spirv.c b/libs/vkd3d-shader/spirv.c
index 299b4e965..11bcc9ae7 100644
--- a/libs/vkd3d-shader/spirv.c
+++ b/libs/vkd3d-shader/spirv.c
@@ -6959,6 +6959,9 @@ static enum GLSLstd450 spirv_compiler_map_ext_glsl_instruction(
     }
     glsl_insts[] =
     {
+        {VKD3DSIH_ACOS, GLSLstd450Acos},
+        {VKD3DSIH_ASIN, GLSLstd450Asin},
+        {VKD3DSIH_ATAN, GLSLstd450Atan},
         {VKD3DSIH_DFMA, GLSLstd450Fma},
         {VKD3DSIH_DMAX, GLSLstd450NMax},
         {VKD3DSIH_DMIN, GLSLstd450NMin},
@@ -9523,6 +9526,9 @@ static int spirv_compiler_handle_instruction(struct spirv_compiler *compiler,
         case VKD3DSIH_ISFINITE:
             spirv_compiler_emit_isfinite(compiler, instruction);
             break;
+        case VKD3DSIH_ACOS:
+        case VKD3DSIH_ASIN:
+        case VKD3DSIH_ATAN:
         case VKD3DSIH_DFMA:
         case VKD3DSIH_DMAX:
         case VKD3DSIH_DMIN:
diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test
index 0cf3f793b..d48457901 100644
--- a/tests/hlsl/inverse-trig.shader_test
+++ b/tests/hlsl/inverse-trig.shader_test
@@ -10,23 +10,23 @@ float4 main() : sv_target
 
 [test]
 uniform 0 float4 -1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (3.14159274, 0.0, 0.0, 0.0) 128
 
 uniform 0 float4 -0.5 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (2.094441441, 0.0, 0.0, 0.0) 128
 
 uniform 0 float4 0.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.57072878, 0.0, 0.0, 0.0) 1024
 
 uniform 0 float4 0.5 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.04715133, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.0, 0.0, 0.0, 0.0) 128
 
 [pixel shader todo]
@@ -39,7 +39,7 @@ float4 main() : sv_target
 
 [test]
 uniform 0 float4 -1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-1.57079637, 0.0, 0.0, 0.0) 128
 
 [require]
@@ -71,15 +71,15 @@ shader model >= 6.0
 % from RADV are a bit lower than these, hence the large max ulp difference.
 [test]
 uniform 0 float4 -0.5 0.0 0.0 0.0
-todo draw quad
+draw quad
 probe all rgba (-0.523598731, 0.0, 0.0, 0.0) 4096
 
 uniform 0 float4 0.0 0.0 0.0 0.0
-todo draw quad
+draw quad
 probe all rgba (0.0, 0.0, 0.0, 0.0) 128
 
 uniform 0 float4 0.5 0.0 0.0 0.0
-todo draw quad
+draw quad
 probe all rgba (0.523598731, 0.0, 0.0, 0.0) 4096
 
 [require]
@@ -87,7 +87,7 @@ probe all rgba (0.523598731, 0.0, 0.0, 0.0) 4096
 
 [test]
 uniform 0 float4 1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.57079637, 0.0, 0.0, 0.0) 128
 
 
@@ -101,23 +101,23 @@ float4 main() : sv_target
 
 [test]
 uniform 0 float4 -1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-0.785409629, 0.0, 0.0, 0.0) 512
 
 uniform 0 float4 -0.5 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-0.4636476, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.0, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.5 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.4636476, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.785409629, 0.0, 0.0, 0.0) 512
 
 [pixel shader todo]
@@ -134,65 +134,65 @@ float4 main() : sv_target
 [test]
 % Non-degenerate cases
 uniform 0 float4 1.0 1.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.785385, 0.0, 0.0, 0.0) 512
 
 uniform 0 float4 5.0 -5.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (2.356194, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 -3.0 -3.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-2.356194, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 -1.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-1.570796, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.0 1.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.0, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.0 -1.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256
 
 % Degenerate cases
 uniform 0 float4 0.00001 0.00002 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (0.463647, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.00001 -0.00002 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (2.677945, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 -0.00001 100000.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (-0.000000000099986595, 0.0, 0.0, 0.0) 2048
 
 uniform 0 float4 10000000.0 0.00000001 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
 
 % Negative zero behavior should be to treat it the
 % same as normal zero.
 uniform 0 float4 1000000000.0 0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 1000000000.0 -0.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (1.570796, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 0.0 -1.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256
 
 uniform 0 float4 -0.0 -1.0 0.0 0.0
-todo draw quad
+todo(sm<6) draw quad
 probe all rgba (3.1415927, 0.0, 0.0, 0.0) 256

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602

Henri Verbeet (＠hverbeet)

2:59 p.m.

...

With tests from !364, separated out from the HLSL changes there and updated. This MR can wait until 364 is upstream though.

I'm not quite sure what the current ETA for that MR is; the expedient thing to do is likely to upstream the tests as part of this MR.

...

+% Microsoft natively outputs values that are slightly mathematically wrong.
+% VKD3D faithfully does the same.

"vkd3d-shader", and Microsoft doesn't necessarily output any values here. d3dcompiler's HLSL compiler does generate the shader code of course, but from that comment it's not clear to me whether that code is inaccurate, or whether that's a result of the hardware or drivers. Ultimately we don't particularly care.

What we do care about is whether the results are consistent across GPUs and drivers for a particular shader model.

...

+[pixel shader todo]
+uniform float4 a;
+
+float4 main() : sv_target
+{
+    // Because the argument order is (y,x) the numbers are
+    // passed in "backwards" here, so they're the right way in the
+    // test cases.
+    return float4(atan2(a.x, a.y), 0.0, 0.0, 0.0);
+}

I don't get it.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/602#note_58997
