Now that we can write HLSL intrinsics in HLSL, cleaning up missing functions becomes much easier.
This commit also extends the trigonometry tests a little bit to make sure that tan works right.
-- v3: vkd3d-shader/hlsl: Implement intrinsic tan.
From: Petrichor Park ppark@codeweavers.com
This commit also extends the trigonometry tests a little bit to make sure that tan works right. --- libs/vkd3d-shader/hlsl.y | 14 +++++++++ tests/hlsl/trigonometry.shader_test | 49 +++++++++++++++++++---------- 2 files changed, 46 insertions(+), 17 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 161d1ab42..d620dbd5d 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -3419,6 +3419,19 @@ static bool intrinsic_step(struct hlsl_ctx *ctx, return !!add_implicit_conversion(ctx, params->instrs, ge, type, loc); }
+static bool intrinsic_tan(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *arg = params->args[0], *sin, *cos; + if (!(sin = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SIN, arg, loc))) + return false; + + if (!(cos = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_COS, arg, loc))) + return false; + + return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_DIV, sin, cos, loc); +} + static bool intrinsic_tex(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc, const char *name, enum hlsl_sampler_dim dim) { @@ -3652,6 +3665,7 @@ intrinsic_functions[] = {"smoothstep", 3, true, intrinsic_smoothstep}, {"sqrt", 1, true, intrinsic_sqrt}, {"step", 2, true, intrinsic_step}, + {"tan", 1, true, intrinsic_tan}, {"tex2D", -1, false, intrinsic_tex2D}, {"tex3D", -1, false, intrinsic_tex3D}, {"texCUBE", -1, false, intrinsic_texCUBE}, diff --git a/tests/hlsl/trigonometry.shader_test b/tests/hlsl/trigonometry.shader_test index 09933d3b7..03e62c145 100644 --- a/tests/hlsl/trigonometry.shader_test +++ b/tests/hlsl/trigonometry.shader_test @@ -8,27 +8,27 @@ void main(out float tex : texcoord, inout float4 pos : sv_position) float4 main(float tex : texcoord) : sv_target { tex = floor(tex + 0.25); - return float4(sin(tex), cos(tex), 0, 0); + return float4(sin(tex), cos(tex), tan(tex), 0); }
[test] draw quad -probe ( 0, 0) rgba ( 0.00000000, 1.00000000, 0.0, 0.0) -probe ( 1, 0) rgba ( 0.84147098, 0.54030231, 0.0, 0.0) 1024 -probe ( 2, 0) rgba ( 0.90929743, -0.41614684, 0.0, 0.0) 1024 -probe ( 3, 0) rgba ( 0.14112001, -0.98999250, 0.0, 0.0) 1024 -probe ( 4, 0) rgba (-0.75680250, -0.65364362, 0.0, 0.0) 1024 -probe ( 5, 0) rgba (-0.95892427, 0.28366219, 0.0, 0.0) 1024 -probe ( 6, 0) rgba (-0.27941550, 0.96017029, 0.0, 0.0) 1024 -probe ( 7, 0) rgba ( 0.65698660, 0.75390225, 0.0, 0.0) 1024 -probe ( 8, 0) rgba ( 0.98935825, -0.14550003, 0.0, 0.0) 1024 -probe ( 9, 0) rgba ( 0.41211849, -0.91113026, 0.0, 0.0) 1024 -probe (10, 0) rgba (-0.54402111, -0.83907153, 0.0, 0.0) 1024 -probe (11, 0) rgba (-0.99999021, 0.00442570, 0.0, 0.0) 2048 -probe (12, 0) rgba (-0.53657292, 0.84385396, 0.0, 0.0) 1024 -probe (13, 0) rgba ( 0.42016704, 0.90744678, 0.0, 0.0) 1024 -probe (14, 0) rgba ( 0.99060736, 0.13673722, 0.0, 0.0) 1024 -probe (15, 0) rgba ( 0.65028784, -0.75968791, 0.0, 0.0) 1024 +probe ( 0, 0) rgba ( 0.00000000, 1.00000000, 0.00000000, 0.0) +probe ( 1, 0) rgba ( 0.84147098, 0.54030231, 1.55740772, 0.0) 1024 +probe ( 2, 0) rgba ( 0.90929743, -0.41614684, -2.18503986, 0.0) 1024 +probe ( 3, 0) rgba ( 0.14112001, -0.98999250, -0.14254654, 0.0) 1024 +probe ( 4, 0) rgba (-0.75680250, -0.65364362, 1.15782128, 0.0) 1024 +probe ( 5, 0) rgba (-0.95892427, 0.28366219, -3.38051501, 0.0) 1024 +probe ( 6, 0) rgba (-0.27941550, 0.96017029, -0.29100619, 0.0) 1024 +probe ( 7, 0) rgba ( 0.65698660, 0.75390225, 0.87144798, 0.0) 1024 +probe ( 8, 0) rgba ( 0.98935825, -0.14550003, -6.79971146, 0.0) 1024 +probe ( 9, 0) rgba ( 0.41211849, -0.91113026, -0.45231566, 0.0) 1024 +probe (10, 0) rgba (-0.54402111, -0.83907153, 0.64836083, 0.0) 1024 +probe (11, 0) rgba (-0.99999021, 0.00442570, -225.95084645, 0.0) 2048 +probe (12, 0) rgba (-0.53657292, 0.84385396, -0.63585993, 0.0) 1024 +probe (13, 0) rgba ( 0.42016704, 0.90744678, 0.46302113, 0.0) 1024 +probe (14, 0) rgba ( 0.99060736, 0.13673722, 7.24460662, 0.0) 1024 +probe (15, 0) rgba ( 0.65028784, -0.75968791, -0.85599340, 0.0) 1024
[pixel shader] @@ -57,3 +57,18 @@ float4 main() : sv_target uniform 0 float4 0.0 0.78539816 1.57079632 2.35619449 draw quad probe all rgba (1000.0, 707.0, -0.0, -707.0) + + +[pixel shader] +uniform float4 a; + +float4 main() : sv_target +{ + return round(1000 * tan(a)); +} + +[test] +uniform 0 float4 0.0 0.78539816 1.57079632 2.35619449 +draw quad +probe all rgba (0, 1000, 1e99, -1000.0) +% use 1e99 as a standin for infinity
On Fri Sep 1 10:25:54 2023 +0000, Giovanni Mascellani wrote:
A workaround to parse INF is to enter a very large number. I think some test has `1e99` or so.
Though it turns out that it's just better to avoid it: on WARP I get this: ``` shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). ```
On Fri Sep 1 15:18:28 2023 +0000, Giovanni Mascellani wrote:
Though it turns out that it's just better to avoid it: on WARP I get this:
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
Yeah, pi/2 is the asymptote point right? So I think I'm going to leave a comment explaining it and just go back to not checking there.
Though it turns out that it's just better to avoid it: on WARP I get this:
shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
Is that sm1 or sm4? I'd expect sm1 to be a bit looser, but sm4 has more specified floating point rules and may be more consistent.
On Fri Sep 1 16:59:51 2023 +0000, Zebediah Figura wrote:
Though it turns out that it's just better to avoid it: on WARP I get this:
shader_runner:648: Section [test], line 70: Test failed: Got
{0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0).
Is that sm1 or sm4? I'd expect sm1 to be a bit looser, but sm4 has more specified floating point rules and may be more consistent.
That's the complete output: ``` shader_runner:1260: Running tests from a Windows cross build shader_runner:1262: Compiling shaders with d3dcompiler_47.dll and executing with d3d9.dll shader_runner:95: Driver string: d3d10warp.dll. shader_runner:96: Device: Microsoft Basic Render Driver, 1414:008c. shader_runner:99: Using WARP device. shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, 1.11848115e+009, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). shader_runner:1265: Compiling shaders with d3dcompiler_47.dll and executing with d3d11.dll shader_runner:164: Adapter: Microsoft Basic Render Driver, 1414:008c. shader_runner:168: Using WARP device. shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, -2.28773335e+010, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). shader_runner:1268: Compiling shaders with d3dcompiler_47.dll and executing with d3d12.dll shader_runner:340: Adapter: Microsoft Basic Render Driver, 1414:008c. shader_runner:344: Using WARP device. shader_runner:648: Section [test], line 70: Test failed: Got {0.00000000e+000, 1.00000000e+003, -2.28773335e+010, -1.00000000e+003}, expected {0.00000000e+000, 1.00000000e+003, 1.#INF0000e+000, -1.00000000e+003} at (0, 0). shader_runner:1238: d3dcompiler_47.dll version: 10.0.22621.4 shader_runner:1238: dxgi.dll version: 10.0.22621.317 shader_runner:1238: d3d9.dll version: 10.0.22621.317 shader_runner:1238: d3d11.dll version: 10.0.22621.317 shader_runner:1238: d3d12.dll version: 10.0.22621.317 shader_runner: 538 tests executed (3 failures, 0 skipped, 0 todo, 0 bugs). ```
So SM1 has a result and SM4 has another. I'm not sure of what exactly is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong.
So SM1 has a result and SM4 has another. I'm not sure of what exactly is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong.
That also makes sense, although it wouldn't surprise me if software expected sinf(M_PI) to really return 0, and so on. So yeah, no objection from me to just leaving out the degenerate case.
On Fri Sep 1 17:58:40 2023 +0000, Zebediah Figura wrote:
So SM1 has a result and SM4 has another. I'm not sure of what exactly
is mandated by IEEE, and whether SM4 is really expected to fully comply with IEEE. Notice that we're not really evaluating on pi/2, because that's not representable as a floating point number. So giving a finite, however large, doesn't look wrong. That also makes sense, although it wouldn't surprise me if software expected sinf(M_PI) to really return 0, and so on. So yeah, no objection from me to just leaving out the degenerate case.
It's enough of an edge case that I can't imagine the behavior is consistent across platforms even with real d3d.