[PATCH v3 0/1] MR321: vkd3d-shader/hlsl: Implement intrinsic tan.
Now that we can write HLSL intrinsics in HLSL, cleaning up missing functions becomes much easier. This commit also extends the trigonometry tests a little bit to make sure that tan works right. -- v3: vkd3d-shader/hlsl: Implement intrinsic tan. https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321
From: Petrichor Park <ppark(a)codeweavers.com> This commit also extends the trigonometry tests a little bit to make sure that tan works right. --- libs/vkd3d-shader/hlsl.y | 14 +++++++++ tests/hlsl/trigonometry.shader_test | 49 +++++++++++++++++++---------- 2 files changed, 46 insertions(+), 17 deletions(-) diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 161d1ab42..d620dbd5d 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -3419,6 +3419,19 @@ static bool intrinsic_step(struct hlsl_ctx *ctx, return !!add_implicit_conversion(ctx, params->instrs, ge, type, loc); } +static bool intrinsic_tan(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *arg = params->args[0], *sin, *cos; + if (!(sin = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SIN, arg, loc))) + return false; + + if (!(cos = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_COS, arg, loc))) + return false; + + return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_DIV, sin, cos, loc); +} + static bool intrinsic_tex(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc, const char *name, enum hlsl_sampler_dim dim) { @@ -3652,6 +3665,7 @@ intrinsic_functions[] = {"smoothstep", 3, true, intrinsic_smoothstep}, {"sqrt", 1, true, intrinsic_sqrt}, {"step", 2, true, intrinsic_step}, + {"tan", 1, true, intrinsic_tan}, {"tex2D", -1, false, intrinsic_tex2D}, {"tex3D", -1, false, intrinsic_tex3D}, {"texCUBE", -1, false, intrinsic_texCUBE}, diff --git a/tests/hlsl/trigonometry.shader_test b/tests/hlsl/trigonometry.shader_test index 09933d3b7..03e62c145 100644 --- a/tests/hlsl/trigonometry.shader_test +++ b/tests/hlsl/trigonometry.shader_test @@ -8,27 +8,27 @@ void main(out float tex : texcoord, inout float4 pos : sv_position) float4 main(float tex : texcoord) : sv_target { tex = floor(tex + 0.25); - return float4(sin(tex), cos(tex), 0, 0); + return float4(sin(tex), cos(tex), tan(tex), 0); } [test] draw quad -probe ( 0, 0) rgba ( 0.00000000, 1.00000000, 0.0, 0.0) -probe ( 1, 0) rgba ( 0.84147098, 0.54030231, 0.0, 0.0) 1024 -probe ( 2, 0) rgba ( 0.90929743, -0.41614684, 0.0, 0.0) 1024 -probe ( 3, 0) rgba ( 0.14112001, -0.98999250, 0.0, 0.0) 1024 -probe ( 4, 0) rgba (-0.75680250, -0.65364362, 0.0, 0.0) 1024 -probe ( 5, 0) rgba (-0.95892427, 0.28366219, 0.0, 0.0) 1024 -probe ( 6, 0) rgba (-0.27941550, 0.96017029, 0.0, 0.0) 1024 -probe ( 7, 0) rgba ( 0.65698660, 0.75390225, 0.0, 0.0) 1024 -probe ( 8, 0) rgba ( 0.98935825, -0.14550003, 0.0, 0.0) 1024 -probe ( 9, 0) rgba ( 0.41211849, -0.91113026, 0.0, 0.0) 1024 -probe (10, 0) rgba (-0.54402111, -0.83907153, 0.0, 0.0) 1024 -probe (11, 0) rgba (-0.99999021, 0.00442570, 0.0, 0.0) 2048 -probe (12, 0) rgba (-0.53657292, 0.84385396, 0.0, 0.0) 1024 -probe (13, 0) rgba ( 0.42016704, 0.90744678, 0.0, 0.0) 1024 -probe (14, 0) rgba ( 0.99060736, 0.13673722, 0.0, 0.0) 1024 -probe (15, 0) rgba ( 0.65028784, -0.75968791, 0.0, 0.0) 1024 +probe ( 0, 0) rgba ( 0.00000000, 1.00000000, 0.00000000, 0.0) +probe ( 1, 0) rgba ( 0.84147098, 0.54030231, 1.55740772, 0.0) 1024 +probe ( 2, 0) rgba ( 0.90929743, -0.41614684, -2.18503986, 0.0) 1024 +probe ( 3, 0) rgba ( 0.14112001, -0.98999250, -0.14254654, 0.0) 1024 +probe ( 4, 0) rgba (-0.75680250, -0.65364362, 1.15782128, 0.0) 1024 +probe ( 5, 0) rgba (-0.95892427, 0.28366219, -3.38051501, 0.0) 1024 +probe ( 6, 0) rgba (-0.27941550, 0.96017029, -0.29100619, 0.0) 1024 +probe ( 7, 0) rgba ( 0.65698660, 0.75390225, 0.87144798, 0.0) 1024 +probe ( 8, 0) rgba ( 0.98935825, -0.14550003, -6.79971146, 0.0) 1024 +probe ( 9, 0) rgba ( 0.41211849, -0.91113026, -0.45231566, 0.0) 1024 +probe (10, 0) rgba (-0.54402111, -0.83907153, 0.64836083, 0.0) 1024 +probe (11, 0) rgba (-0.99999021, 0.00442570, -225.95084645, 0.0) 2048 +probe (12, 0) rgba (-0.53657292, 0.84385396, -0.63585993, 0.0) 1024 +probe (13, 0) rgba ( 0.42016704, 0.90744678, 0.46302113, 0.0) 1024 +probe (14, 0) rgba ( 0.99060736, 0.13673722, 7.24460662, 0.0) 1024 +probe (15, 0) rgba ( 0.65028784, -0.75968791, -0.85599340, 0.0) 1024 [pixel shader] @@ -57,3 +57,18 @@ float4 main() : sv_target uniform 0 float4 0.0 0.78539816 1.57079632 2.35619449 draw quad probe all rgba (1000.0, 707.0, -0.0, -707.0) + + +[pixel shader] +uniform float4 a; + +float4 main() : sv_target +{ + return round(1000 * tan(a)); +} + +[test] +uniform 0 float4 0.0 0.78539816 1.57079632 2.35619449 +draw quad +probe all rgba (0, 1000, 1e99, -1000.0) +% use 1e99 as a standin for infinity -- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321
On Fri Sep 1 10:25:54 2023 +0000, Giovanni Mascellani wrote:
A workaround to parse INF is to enter a very large number. I think some test has `1e99` or so. Though it turns out that it's just better to avoid it: on WARP I get this:
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44062
On Fri Sep 1 15:18:28 2023 +0000, Giovanni Mascellani wrote:
Though it turns out that it's just better to avoid it: on WARP I get this: ``` shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). ``` Yeah, pi/2 is the asymptote point right? So I think I'm going to leave a comment explaining it and just go back to not checking there.
-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44063
Though it turns out that it's just better to avoid it: on WARP I get this:
``` shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). ```
Is that sm1 or sm4? I'd expect sm1 to be a bit looser, but sm4 has more specified floating point rules and may be more consistent. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44066
On Fri Sep 1 16:59:51 2023 +0000, Zebediah Figura wrote:
Though it turns out that it's just better to avoid it: on WARP I get this:
``` shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). ``` Is that sm1 or sm4? I'd expect sm1 to be a bit looser, but sm4 has more specified floating point rules and may be more consistent. That's the complete output:
shader_runner:1260: Running tests from a Windows cross build
shader_runner:1262: Compiling shaders with d3dcompiler_47.dll and executing with d3d9.dll
shader_runner:95: Driver string: d3d10warp.dll.
shader_runner:96: Device: Microsoft Basic Render Driver, 1414:008c.
shader_runner:99: Using WARP device.
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
shader_runner:1265: Compiling shaders with d3dcompiler_47.dll and executing with d3d11.dll
shader_runner:164: Adapter: Microsoft Basic Render Driver, 1414:008c.
shader_runner:168: Using WARP device.
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, -2.28773335e+010, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
shader_runner:1268: Compiling shaders with d3dcompiler_47.dll and executing with d3d12.dll
shader_runner:340: Adapter: Microsoft Basic Render Driver, 1414:008c.
shader_runner:344: Using WARP device.
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, -2.28773335e+010, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
shader_runner:1238: d3dcompiler_47.dll version: 10.0.22621.4
shader_runner:1238: dxgi.dll version: 10.0.22621.317
shader_runner:1238: d3d9.dll version: 10.0.22621.317
shader_runner:1238: d3d11.dll version: 10.0.22621.317
shader_runner:1238: d3d12.dll version: 10.0.22621.317
shader_runner: 538 tests executed (3 failures, 0 skipped, 0 todo, 0 bugs).
So SM1 has a result and SM4 has another. I'm not sure of what exactly is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44067
So SM1 has a result and SM4 has another. I'm not sure of what exactly is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong.
That also makes sense, although it wouldn't surprise me if software expected sinf(M_PI) to really return 0, and so on. So yeah, no objection from me to just leaving out the degenerate case. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44069
On Fri Sep 1 17:58:40 2023 +0000, Zebediah Figura wrote:
So SM1 has a result and SM4 has another. I'm not sure of what exactly is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong. That also makes sense, although it wouldn't surprise me if software expected sinf(M_PI) to really return 0, and so on. So yeah, no objection from me to just leaving out the degenerate case. It's enough of an edge case that I can't imagine the behavior is consistent across platforms even with real d3d.
-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/321#note_44092
participants (4)
-
Giovanni Mascellani (@giomasce) -
Petrichor Park -
Petrichor Park (@petrathekat) -
Zebediah Figura (@zfigura)