-- v6: vkd3d-shader/hlsl: Support distance() intrinsic. vkd3d-shader/hlsl: Support rsqrt() intrinsic.
From: Nikolay Sivov nsivov@codeweavers.com
--- libs/vkd3d-shader/hlsl.y | 12 ++++++++++++ tests/sqrt.shader_test | 13 +++++++++++++ 2 files changed, 25 insertions(+)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index d5e2b2a9..218bbc44 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2785,6 +2785,17 @@ static bool intrinsic_round(struct hlsl_ctx *ctx, return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_ROUND, arg, loc); }
+static bool intrinsic_rsqrt(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *arg; + + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc))) + return false; + + return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_RSQ, arg, loc); +} + static bool intrinsic_saturate(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { @@ -2983,6 +2994,7 @@ intrinsic_functions[] = {"normalize", 1, true, intrinsic_normalize}, {"pow", 2, true, intrinsic_pow}, {"round", 1, true, intrinsic_round}, + {"rsqrt", 1, true, intrinsic_rsqrt}, {"saturate", 1, true, intrinsic_saturate}, {"sin", 1, true, intrinsic_sin}, {"smoothstep", 3, true, intrinsic_smoothstep}, diff --git a/tests/sqrt.shader_test b/tests/sqrt.shader_test index 5d048b4f..78d89d38 100644 --- a/tests/sqrt.shader_test +++ b/tests/sqrt.shader_test @@ -10,3 +10,16 @@ float4 main() : sv_target uniform 0 float4 1.0 9.0 32.3 46.5 draw quad probe all rgba (1.0, 3.0, 5.683309, 6.819091) 1 + +[pixel shader] +uniform float4 f; + +float4 main() : sv_target +{ + return rsqrt(f); +} + +[test] +uniform 0 float4 1.0 9.0 4.0 16.0 +draw quad +probe all rgba (1.0, 0.33333333, 0.5, 0.25) 1
From: Nikolay Sivov nsivov@codeweavers.com
--- Makefile.am | 1 + libs/vkd3d-shader/hlsl.y | 18 ++++++++++++++++++ tests/distance.shader_test | 14 ++++++++++++++ 3 files changed, 33 insertions(+) create mode 100644 tests/distance.shader_test
diff --git a/Makefile.am b/Makefile.am index f9199472..822906c8 100644 --- a/Makefile.am +++ b/Makefile.am @@ -59,6 +59,7 @@ vkd3d_shader_tests = \ tests/cbuffer.shader_test \ tests/compute.shader_test \ tests/conditional.shader_test \ + tests/distance.shader_test \ tests/entry-point-semantics.shader_test \ tests/exp.shader_test \ tests/floor.shader_test \ diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 218bbc44..cd8b44a1 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2492,6 +2492,23 @@ static bool intrinsic_cross(struct hlsl_ctx *ctx, return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, mul2, mul1_neg, loc); }
+static bool intrinsic_distance(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *neg, *add, *dot; + + if (!(neg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_NEG, params->args[1], loc))) + return false; + + if (!(add = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, params->args[0], neg, loc))) + return false; + + if (!(dot = add_binary_dot_expr(ctx, params->instrs, add, add, loc))) + return false; + + return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SQRT, dot, loc); +} + static bool intrinsic_dot(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { @@ -2980,6 +2997,7 @@ intrinsic_functions[] = {"clamp", 3, true, intrinsic_clamp}, {"cos", 1, true, intrinsic_cos}, {"cross", 2, true, intrinsic_cross}, + {"distance", 2, true, intrinsic_distance}, {"dot", 2, true, intrinsic_dot}, {"exp", 1, true, intrinsic_exp}, {"exp2", 1, true, intrinsic_exp2}, diff --git a/tests/distance.shader_test b/tests/distance.shader_test new file mode 100644 index 00000000..29794fcc --- /dev/null +++ b/tests/distance.shader_test @@ -0,0 +1,14 @@ +[pixel shader] +uniform float4 x; +uniform float4 y; + +float4 main() : sv_target +{ + return distance(x, y); +} + +[test] +uniform 0 float4 -2.0 3.0 4.0 0.1 +uniform 4 float4 2.0 -1.0 4.0 5.0 +draw quad +probe all rgba (7.483983, 7.483983, 7.483983, 7.483983) 1
On Mon Jan 30 12:07:39 2023 +0000, Nikolay Sivov wrote:
This is blocked by a missing dot() actually, so it should get in first.
A fix for dot() got in, and the test was added as well.
On Mon Jan 30 18:56:13 2023 +0000, Zebediah Figura wrote:
What Giovanni said, although it's kind of a mix of both reasons. We don't want to match native compiler output exactly, but it's also useful to test the lower level compilers (for our purposes, dxbc -> spirv/glsl, but Mesa has also expressed interest in running our tests.)
At least for rsqrt() results depend on shader model, for d3d9 input is going through abs() equivalent when executed, so it's never invalid. For d3d11 output contains NaNs. Detecting that with isnan() won't work, because that's not compiled reliably. We could probably use some inline isnan() logic, but still there is no way currently to pick different test result. I understand the importance of getting a good coverage, but that goes beyond implementing builtin functions. I don't think one that should block another.
On Mon Feb 13 20:42:01 2023 +0000, Nikolay Sivov wrote:
At least for rsqrt() results depend on shader model, for d3d9 input is going through abs() equivalent when executed, so it's never invalid. For d3d11 output contains NaNs. Detecting that with isnan() won't work, because that's not compiled reliably. We could probably use some inline isnan() logic, but still there is no way currently to pick different test result. I understand the importance of getting a good coverage, but that goes beyond implementing builtin functions. I don't think one that should block another.
I don't think it should block this patch, no.
On Mon Feb 13 21:53:38 2023 +0000, Zebediah Figura wrote:
I don't think it should block this patch, no.
I am not sure I understand the problem. As you said, you can reimplement `isnan()` with something like: ```hlsl bool4 xisnan(float4 x) { return (asuint(x) & 0x7FFFFFFF) > 0x7F800000; } ``` (adapted from [this tweet](https://twitter.com/_humus_/status/1074973351276371968) and [this shader playground](https://shader-playground.timjones.io/0bbe04704caddadb52cd4134daab8ac3) mentioned by Zeb)
Or you can just pass the NaN to `asuint()`, so we can also test the actual NaN bits.
Also, you can constrain the test to SM4 using: ``` [require] shader model >= 4.0 ``` on the test with negative numbers.
It's not important if we still do not support `asuint()` yet. You can leave the test `todo`, but at least have it pinpoint the expected outcome.
This doesn't seem to be particularly difficult or long. Or am I missing something?
On Tue Feb 14 11:21:22 2023 +0000, Giovanni Mascellani wrote:
I am not sure I understand the problem. As you said, you can reimplement `isnan()` with something like:
bool4 xisnan(float4 x) { return (asuint(x) & 0x7FFFFFFF) > 0x7F800000; }
(adapted from [this tweet](https://twitter.com/_humus_/status/1074973351276371968) and [this shader playground](https://shader-playground.timjones.io/0bbe04704caddadb52cd4134daab8ac3) mentioned by Zeb) Or you can just pass the NaN to `asuint()`, so we can also test the actual NaN bits. Also, you can constrain the test to SM4 using:
[require] shader model >= 4.0
on the test with negative numbers. It's not important if we still do not support `asuint()` yet. You can leave the test `todo`, but at least have it pinpoint the expected outcome. This doesn't seem to be particularly difficult or long. Or am I missing something?
I'd be happy to look into that, if this is the approach we want to take. It's more about improving existing testing tools, and not the problem this MR is for.
I don't know if "require" ends up a correct name for this functionality, shader does not require specific version, results do. Limiting to >= 4.0 is also insufficient, because that does not show the difference with SM3 and below, which takes absolute value of the input internally.
On Tue Feb 14 18:20:21 2023 +0000, Nikolay Sivov wrote:
I'd be happy to look into that, if this is the approach we want to take. It's more about improving existing testing tools, and not the problem this MR is for. I don't know if "require" ends up a correct name for this functionality, shader does not require specific version, results do. Limiting to >= 4.0 is also insufficient, because that does not show the difference with SM3 and below, which takes absolute value of the input internally.
I think it'd be better to test that the render target contains inf/nan, unless that differs or is impossible for some reason. I have tests in my local tree that do that for sm4 at least.
I don't know if "require" ends up a correct name for this functionality, shader does not require specific version, results do. Limiting to >= 4.0 is also insufficient, because that does not show the difference with SM3 and below, which takes absolute value of the input internally.
Eh, yeah, although [require] can be put anywhere, e.g. before the [test] block. It's maybe not the best name, but it doesn't seem awful; the point is just to restrict the platforms we run those tests on, for whatever reason.
On Tue Feb 14 18:41:43 2023 +0000, Zebediah Figura wrote:
I think it'd be better to test that the render target contains inf/nan, unless that differs or is impossible for some reason. I have tests in my local tree that do that for sm4 at least.
I don't know if "require" ends up a correct name for this
functionality, shader does not require specific version, results do. Limiting to >= 4.0 is also insufficient, because that does not show the difference with SM3 and below, which takes absolute value of the input internally. Eh, yeah, although [require] can be put anywhere, e.g. before the [test] block. It's maybe not the best name, but it doesn't seem awful; the point is just to restrict the platforms we run those tests on, for whatever reason.
Yes, having actual values in expected results is better, if it's reliable. Require could mean "restrict to", so it's not bad. I can probably add support for "< 4.0" if that does not work already, and use that for d3d9.
Currently, `distance()` fails when passing `int` parameters: ``` int a, b;
float4 main() : sv_target { return distance(a, b); } ``` ``` vkd3d/libs/vkd3d-shader/hlsl_sm4.c:1800: write_sm4_expr: Assertion `type_is_float(dst_type)' failed. Aborted (core dumped) ```
it is worth noting that `add_binary_arithmetic_expr()` results in a value of the common type (which is `int` if both arguments are `int`).