[PATCH v3 0/3] MR364: vkd3d-shader/hlsl: Implement inverse trigonometry.

List overview All Threads

newer

older

[PATCH v5 0/1] MR366:...

[PATCH 0/1] MR4252: user32/tests:...

Petrichor Park (＠petrathekat)

17 Oct 2023 17 Oct '23

12:37 a.m.

Implements asin, acos, atan, and atan2.

Also includes some tests in a new test file.

One possible problem here is that I'm not sure how to test what Microsoft's atan and atan2 outputs are in boundary cases like atan2(1, 0). I've made the test suites adhere with the calculator program I've been using (Qalculate, which I assume is using libc's atan2).

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=55154

-- v3: vkd3d-shader/hlsl: Implement atan2. vkd3d-shader/hlsl: Implement atan.

https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364

Show replies by date

Petrichor Park

17 Oct 17 Oct

12:37 a.m.

New subject: [PATCH v3 1/3] vkd3d-shader/hlsl: Implement acos and asin trig intrinsics.

From: Petrichor Park ppark@codeweavers.com

Also includes some tests in a new test file.

Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=55154 --- Makefile.am | 1 + libs/vkd3d-shader/hlsl.y | 62 +++++++++++++++++++++++++++++ tests/hlsl/inverse-trig.shader_test | 30 ++++++++++++++ 3 files changed, 93 insertions(+) create mode 100644 tests/hlsl/inverse-trig.shader_test

diff --git a/Makefile.am b/Makefile.am index 744e46127..39294833c 100644 --- a/Makefile.am +++ b/Makefile.am @@ -109,6 +109,7 @@ vkd3d_shader_tests = \ tests/hlsl/initializer-struct.shader_test \ tests/hlsl/intrinsic-override.shader_test \ tests/hlsl/invalid.shader_test \ + tests/hlsl/inverse-trig.shader_test \ tests/hlsl/is-front-face.shader_test \ tests/hlsl/ldexp.shader_test \ tests/hlsl/length.shader_test \ diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index fb6d485ea..09428e2c5 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2525,6 +2525,60 @@ static bool intrinsic_abs(struct hlsl_ctx *ctx, return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_ABS, params->args[0], loc); }

+static bool write_acos_or_asin(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc, bool asin_mode) +{ + struct hlsl_ir_function_decl *func; + struct hlsl_type *type; + char *body; + + static const char template[] = + "%s %s(%s x)\n" + "{\n" + " %s abs_arg = abs(x);\n" + " %s correction = sqrt(1.0 - abs_arg);\n" + " %s result = correction * (((-0.019963376f\n" + " * abs_arg + 0.07612093f)\n" + " * abs_arg - 0.21274031f)\n" + " * abs_arg + 1.5707964f\n);" + /* Piecewise, (x >= 0) ? result : pi - result. */ + " %s mask = x >= 0.0;\n" + " result = mask * result\n" + " + (1.0 - mask) * (3.14159265 - result);\n" + " return %s;" + "}"; + static const char fn_name_acos[] = "acos"; + static const char fn_name_asin[] = "asin"; + static const char return_stmt_acos[] = "result"; + static const char return_stmt_asin[] = "-result + (3.14159265 / 2.0)"; + + const char *fn_name = asin_mode + ? fn_name_asin + : fn_name_acos; + + type = params->args[0]->data_type; + type = hlsl_get_numeric_type(ctx, type->class, HLSL_TYPE_FLOAT, type->dimx, type->dimy); + + if (!(body = hlsl_sprintf_alloc(ctx, template, + type->name, fn_name, type->name, + type->name, type->name, + type->name, type->name, + (asin_mode ? return_stmt_asin : return_stmt_acos)))) + return false; + func = hlsl_compile_internal_function(ctx, fn_name, body); + vkd3d_free(body); + if (!func) + return false; + + return add_user_call(ctx, func, params, loc); +} + +static bool intrinsic_acos(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + return write_acos_or_asin(ctx, params, loc, false); +} + static bool intrinsic_all(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { @@ -2602,6 +2656,12 @@ static bool intrinsic_any(struct hlsl_ctx *ctx, return false; }

+static bool intrinsic_asin(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + return write_acos_or_asin(ctx, params, loc, true); +} + /* Find the type corresponding to the given source type, with the same * dimensions but a different base type. */ static struct hlsl_type *convert_numeric_type(const struct hlsl_ctx *ctx, @@ -3655,9 +3715,11 @@ intrinsic_functions[] = /* Note: these entries should be kept in alphabetical order. */ {"D3DCOLORtoUBYTE4", 1, true, intrinsic_d3dcolor_to_ubyte4}, {"abs", 1, true, intrinsic_abs}, + {"acos", 1, true, intrinsic_acos}, {"all", 1, true, intrinsic_all}, {"any", 1, true, intrinsic_any}, {"asfloat", 1, true, intrinsic_asfloat}, + {"asin", 1, true, intrinsic_asin}, {"asuint", -1, true, intrinsic_asuint}, {"clamp", 3, true, intrinsic_clamp}, {"clip", 1, true, intrinsic_clip}, diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test new file mode 100644 index 000000000..7be7c5e7d --- /dev/null +++ b/tests/hlsl/inverse-trig.shader_test @@ -0,0 +1,30 @@ +[pixel shader] +uniform float4 a; + +float4 main() : sv_target +{ + // Add 10 to avoid epsilon problems near 0 + // Note that this means -8.5odd means -pi/2 + return float4(asin(a.x), acos(a.y), 0.0, 0) + 10.0; +} + +[test] +uniform 0 float4 -1.0 -1.0 -1.0 0.0 +draw quad +probe all rgba (8.429203673, 13.141592654, 10.0, 10.0) 256 + +uniform 0 float4 -0.5 -0.5 -0.5 0.0 +draw quad +probe all rgba (9.476401224, 12.09439510, 10.0, 10.0) 256 + +uniform 0 float4 0.0 0.0 0.0 0.0 +draw quad +probe all rgba (10.0, 11.570796327, 10.0, 10.0) 256 + +uniform 0 float4 0.5 0.5 0.5 0.0 +draw quad +probe all rgba (10.52359878, 11.04719755, 10.0, 10.0) 256 + +uniform 0 float4 1.0 1.0 1.0 0.0 +draw quad +probe all rgba (11.570796327, 10.0, 10.0, 10.0) 256

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364

Petrichor Park

12:37 a.m.

New subject: [PATCH v3 2/3] vkd3d-shader/hlsl: Implement atan.

From: Petrichor Park ppark@codeweavers.com

--- libs/vkd3d-shader/hlsl.y | 48 +++++++++++++++++++++++++++++ tests/hlsl/inverse-trig.shader_test | 10 +++--- 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 09428e2c5..62b119575 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2662,6 +2662,53 @@ static bool intrinsic_asin(struct hlsl_ctx *ctx, return write_acos_or_asin(ctx, params, loc, true); }

+static bool intrinsic_atan(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_function_decl *func; + struct hlsl_type *type; + char *body; + + static const char template[] = + "%s atan(%s x)\n" + "{\n" + " %s abs_arg = abs(x);\n" + " %s input = (abs_arg < 1.0) ? x : 1.0 / x;\n" + /* HLSL pow(x, y) always returns NaN if x<0. So write out the powers longhand. */ + " %s x2 = input * input;\n" + " %s x3 = input * x2;\n" + " %s x5 = x3 * x2;\n" + " %s x7 = x5 * x2;\n" + " %s x9 = x7 * x2;\n" + " %s poly_approx = \n" + " 0.020265052 * x9\n" + " + -0.0840074 * x7\n" + " + 0.17941668 * x5\n" + " + -0.3301321 * x3\n" + " + 0.99985593 * input;\n" + " return (abs_arg < 1.0)\n" + " ? poly_approx\n" + " : (((x < 0.0 ? -1 : 1) * 3.14159265 / 2.0) - poly_approx);\n" + "}"; + + if (!(type = elementwise_intrinsic_get_common_type(ctx, params, loc))) + return false; + type = hlsl_get_numeric_type(ctx, type->class, HLSL_TYPE_FLOAT, type->dimx, type->dimy); + + + if (!(body = hlsl_sprintf_alloc(ctx, template, type->name, type->name, + type->name, type->name, + type->name, type->name, type->name, type->name, type->name, + type->name))) + return false; + func = hlsl_compile_internal_function(ctx, "atan", body); + vkd3d_free(body); + if (!func) + return false; + + return add_user_call(ctx, func, params, loc); +} + /* Find the type corresponding to the given source type, with the same * dimensions but a different base type. */ static struct hlsl_type *convert_numeric_type(const struct hlsl_ctx *ctx, @@ -3721,6 +3768,7 @@ intrinsic_functions[] = {"asfloat", 1, true, intrinsic_asfloat}, {"asin", 1, true, intrinsic_asin}, {"asuint", -1, true, intrinsic_asuint}, + {"atan", 1, true, intrinsic_atan}, {"clamp", 3, true, intrinsic_clamp}, {"clip", 1, true, intrinsic_clip}, {"cos", 1, true, intrinsic_cos}, diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test index 7be7c5e7d..8820ec002 100644 --- a/tests/hlsl/inverse-trig.shader_test +++ b/tests/hlsl/inverse-trig.shader_test @@ -5,17 +5,17 @@ float4 main() : sv_target { // Add 10 to avoid epsilon problems near 0 // Note that this means -8.5odd means -pi/2 - return float4(asin(a.x), acos(a.y), 0.0, 0) + 10.0; + return float4(asin(a.x), acos(a.y), atan(a.z), 0) + 10.0; }

[test] uniform 0 float4 -1.0 -1.0 -1.0 0.0 draw quad -probe all rgba (8.429203673, 13.141592654, 10.0, 10.0) 256 +probe all rgba (8.429203673, 13.141592654, 9.214601837, 10.0) 256

uniform 0 float4 -0.5 -0.5 -0.5 0.0 draw quad -probe all rgba (9.476401224, 12.09439510, 10.0, 10.0) 256 +probe all rgba (9.476401224, 12.09439510, 9.536352391, 10.0) 256

uniform 0 float4 0.0 0.0 0.0 0.0 draw quad @@ -23,8 +23,8 @@ probe all rgba (10.0, 11.570796327, 10.0, 10.0) 256

uniform 0 float4 0.5 0.5 0.5 0.0 draw quad -probe all rgba (10.52359878, 11.04719755, 10.0, 10.0) 256 +probe all rgba (10.52359878, 11.04719755, 10.46364761, 10.0) 256

uniform 0 float4 1.0 1.0 1.0 0.0 draw quad -probe all rgba (11.570796327, 10.0, 10.0, 10.0) 256 +probe all rgba (11.570796327, 10.0, 10.7853981634, 10.0) 256

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364

Petrichor Park

12:37 a.m.

New subject: [PATCH v3 3/3] vkd3d-shader/hlsl: Implement atan2.

From: Petrichor Park ppark@codeweavers.com

--- libs/vkd3d-shader/hlsl.y | 37 +++++++++++++++++++++++++ tests/hlsl/inverse-trig.shader_test | 43 +++++++++++++++++++++++++++++ 2 files changed, 80 insertions(+)

diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 62b119575..b39c6d320 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2709,6 +2709,42 @@ static bool intrinsic_atan(struct hlsl_ctx *ctx, return add_user_call(ctx, func, params, loc); }

+static bool intrinsic_atan2(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_function_decl *func; + struct hlsl_type *type; + char *body; + + static const char template[] = + "%s atan2(%s y, %s x)\n" + "{\n" + " %s pi = %s(3.1415926);\n" + " %s offset = x >= 0.0\n" + " ? 0.0\n" + " : y >= 0.0\n" + " ? pi\n" + " : -pi;\n" + " return atan(y/x) + offset;\n" + "}"; + + if (!(type = elementwise_intrinsic_get_common_type(ctx, params, loc))) + return false; + type = hlsl_get_numeric_type(ctx, type->class, HLSL_TYPE_FLOAT, type->dimx, type->dimy); + + if (!(body = hlsl_sprintf_alloc(ctx, template, + type->name, type->name, type->name, + type->name, type->name, type->name))) + return false; + func = hlsl_compile_internal_function(ctx, "atan2", body); + vkd3d_free(body); + if (!func) + return false; + + return add_user_call(ctx, func, params, loc); +} + + /* Find the type corresponding to the given source type, with the same * dimensions but a different base type. */ static struct hlsl_type *convert_numeric_type(const struct hlsl_ctx *ctx, @@ -3769,6 +3805,7 @@ intrinsic_functions[] = {"asin", 1, true, intrinsic_asin}, {"asuint", -1, true, intrinsic_asuint}, {"atan", 1, true, intrinsic_atan}, + {"atan2", 2, true, intrinsic_atan2}, {"clamp", 3, true, intrinsic_clamp}, {"clip", 1, true, intrinsic_clip}, {"cos", 1, true, intrinsic_cos}, diff --git a/tests/hlsl/inverse-trig.shader_test b/tests/hlsl/inverse-trig.shader_test index 8820ec002..493019984 100644 --- a/tests/hlsl/inverse-trig.shader_test +++ b/tests/hlsl/inverse-trig.shader_test @@ -28,3 +28,46 @@ probe all rgba (10.52359878, 11.04719755, 10.46364761, 10.0) 256 uniform 0 float4 1.0 1.0 1.0 0.0 draw quad probe all rgba (11.570796327, 10.0, 10.7853981634, 10.0) 256 + + +[pixel shader] +uniform float4 a; + +float4 main() : sv_target +{ + // Return things in terms of pi to make the numbers in the + // test cases more understandable. + // Also, because the argument order is (y,x) the numbers are + // passed in "backwards" here, so they're the right way in the + // test cases. + return float4(atan2(a.x, a.y), 0.0, 0.0, 0.0) / 3.14159265; +} + +[test] +uniform 0 float4 1.0 1.0 0.0 0.0 +draw quad +probe all rgba (0.25, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 5.0 -5.0 0.0 0.0 +draw quad +probe all rgba (0.75, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 -3.0 -3.0 0.0 0.0 +draw quad +probe all rgba (-0.75, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 1.0 0.0 0.0 0.0 +draw quad +probe all rgba (0.5, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 -1.0 0.0 0.0 0.0 +draw quad +probe all rgba (-0.5, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 0.0 1.0 0.0 0.0 +draw quad +probe all rgba (0.0, 0.0, 0.0, 0.0) 256 + +uniform 0 float4 0.0 -1.0 0.0 0.0 +draw quad +probe all rgba (1.0, 0.0, 0.0, 0.0) 256

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364

Zebediah Figura (＠zfigura)

20 Oct 20 Oct

7:42 p.m.

New subject: [PATCH v3 0/3] MR364: vkd3d-shader/hlsl: Implement inverse trigonometry. - approved

This merge request was approved by Zebediah Figura.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364

Nikolay Sivov (＠nsivov)

30 Oct 30 Oct

12:41 p.m.

This needs to be updated for dxcompiler, currently I get 12 test failures with it from inverse-trig test.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50160

Giovanni Mascellani (＠giomasce)

2 Nov 2 Nov

9:17 a.m.

On Thu Nov 2 09:17:28 2023 +0000, Nikolay Sivov wrote:

...

This needs to be updated for dxcompiler, currently I get 12 test failures with it from inverse-trig test.

Yes, this should probably rebased on top of `master` to have a new CI pipeline run. The CI pipeline has changed a lot lately: I realize it can be frustrating to keep up with it, but it's really useful to catch bugs early and harmonize all the current vkd3d development directions.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50597

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/hlsl.y:

...

struct hlsl_type *type;
char *body;
static const char template[] =
```
       "%s %s(%s x)\n"
```
```
       "{\n"
```
```
       "    %s abs_arg = abs(x);\n"
```

       "    %s correction = sqrt(1.0 - abs_arg);\n"

       "    %s result = correction * (((-0.019963376f\n"

       "        * abs_arg + 0.07612093f)\n"

       "        * abs_arg - 0.21274031f)\n"

       "        * abs_arg + 1.5707964f\n);"

       /* Piecewise, (x >= 0) ? result : pi - result. */

```
       "    %s mask = x >= 0.0;\n"
```
```
       "    result = mask * result\n"
```

       "        + (1.0 - mask) * (3.14159265 - result);\n"

Please see https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_48247.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50598

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/hlsl.y:

...

```
       "    %s abs_arg = abs(x);\n"
```

       "    %s correction = sqrt(1.0 - abs_arg);\n"

       "    %s result = correction * (((-0.019963376f\n"

       "        * abs_arg + 0.07612093f)\n"

       "        * abs_arg - 0.21274031f)\n"

       "        * abs_arg + 1.5707964f\n);"

       /* Piecewise, (x >= 0) ? result : pi - result. */

```
       "    %s mask = x >= 0.0;\n"
```
```
       "    result = mask * result\n"
```

       "        + (1.0 - mask) * (3.14159265 - result);\n"

```
       "    return %s;"
```
```
       "}";
```
static const char fn_name_acos[] = "acos";
static const char fn_name_asin[] = "asin";
static const char return_stmt_acos[] = "result";
static const char return_stmt_asin[] = "-result + (3.14159265 / 2.0)";

In the same spirit, just use `1.570796f`.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50599

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about tests/hlsl/inverse-trig.shader_test:

...

+[pixel shader] +uniform float4 a;

+float4 main() : sv_target +{

// Add 10 to avoid epsilon problems near 0

// Note that this means -8.5odd means -pi/2

return float4(asin(a.x), acos(a.y), 0.0, 0) + 10.0;

+}

+[test] +uniform 0 float4 -1.0 -1.0 -1.0 0.0 +draw quad +probe all rgba (8.429203673, 13.141592654, 10.0, 10.0) 256

Are you sure you need this amount of tolerance, once you use the coefficients and formulae from the native compiler?

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50600

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/hlsl.y:

...

       "    %s input = (abs_arg < 1.0) ? x : 1.0 / x;\n"

       /* HLSL pow(x, y) always returns NaN if x<0. So write out the powers longhand. */

```
       "    %s x2 = input * input;\n"
```
```
       "    %s x3 = input * x2;\n"
```
```
       "    %s x5 =    x3 * x2;\n"
```
```
       "    %s x7 =    x5 * x2;\n"
```
```
       "    %s x9 =    x7 * x2;\n"
```
```
       "    %s poly_approx = \n"
```
```
       "        0.020265052 * x9\n"
```
```
       "        + -0.0840074 * x7\n"
```
```
       "        + 0.17941668 * x5\n"
```
```
       "        + -0.3301321 * x3\n"
```

       "        + 0.99985593 * input;\n"

```
       "    return (abs_arg < 1.0)\n"
```
```
       "        ? poly_approx\n"
```

       "        : (((x < 0.0 ? -1 : 1) * 3.14159265 / 2.0) - poly_approx);\n"

Here too please use the same formula and coefficients as native, see for example https://shader-playground.timjones.io/b97163caaa9e111291dc673f7c9060cb.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50601

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about tests/hlsl/inverse-trig.shader_test:

...

+[pixel shader] +uniform float4 a;

+float4 main() : sv_target +{

// Add 10 to avoid epsilon problems near 0

// Note that this means -8.5odd means -pi/2

Is this really needed, especially once you're using the right coefficients?

Please, also test what happens out of the definition domain of the function, including for infinities.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50602

Giovanni Mascellani (＠giomasce)

9:46 a.m.

Giovanni Mascellani (@giomasce) commented about tests/hlsl/inverse-trig.shader_test:

...

+[pixel shader] +uniform float4 a;

+float4 main() : sv_target +{

// Return things in terms of pi to make the numbers in the

// test cases more understandable.

// Also, because the argument order is (y,x) the numbers are

// passed in "backwards" here, so they're the right way in the

// test cases.

return float4(atan2(a.x, a.y), 0.0, 0.0, 0.0) / 3.14159265;

+}

+[test]

Please, also test more "random" values of `x` and `y`, including cases in which `x` and `y` are different in absolute value, cases in which either `x` or `y` is very small but not zero, or infinite, or very large but not infinite, or a negative zero. The point of using `atan2(y, x)` rather than `atan(y / x)` is to have a lot of corner cases dealt with automatically (see for example the `atan2` Linux man page), so I think we should test for that.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50603

Giovanni Mascellani (＠giomasce)

9:47 a.m.

Hi, sorry for the delay, but here is another batch of comments.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_50604

713

Age (days ago)

729

Last active (days ago)

wine-gitlab@winehq.org

13 comments

5 participants

tags (0)

participants (5)

Giovanni Mascellani (＠giomasce)
Nikolay Sivov (＠nsivov)
Petrichor Park
Petrichor Park (＠petrathekat)
Zebediah Figura (＠zfigura)