Re: [PATCH 0/3] MR364: vkd3d-shader/hlsl: Implement inverse trigonometry.

25 Sep 2023


      Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/hlsl.y:
...
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc, bool asin_mode)
+{
+    struct hlsl_ir_function_decl *func;
+    struct hlsl_type *type;
+    char *body;
+
+    static const char template[] =
+            "%s %s(%s x)\n"
+            "{\n"
+            "    %s abs_arg = abs(x);\n"
+            "    %s correction = sqrt(1.0 - abs_arg);\n"
+            "    %s result = correction * (\n"
+            "        (3.14159265 / 2.0)\n"
+            "        - (0.2127403136003234 * abs_arg)\n"
+            "        + (0.07612092595257536 * abs_arg * abs_arg)\n"
+            "        - (0.01996337677405357 * abs_arg * abs_arg * abs_arg)\n"
Notice that native, at least in my tests, evaluates the polynomial using [Horner's method](https://en.wikipedia.org/wiki/Horner's_method), which is probably more efficient (it takes only three multiplications instead of six in this case, if I'm not mistaken). That would amount to something like `((-0.01996337677405357f * abs_arg + 0.07612092595257536f) * abs_arg - 0.2127403136003234f) * abs_arg + 1.570796325f`.
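To make the comparison concrete, here is a minimal C sketch of the two evaluation orders. The coefficients are copied from the patch (not from native), and the names `poly_expanded` and `poly_horner` are made up for illustration; the actual change would of course go into the HLSL template string.

```c
#include <math.h>

/* Coefficients as they appear in the patch under review. */
static const float C0 = 1.570796325f;          /* 3.14159265 / 2.0 */
static const float C1 = 0.2127403136003234f;
static const float C2 = 0.07612092595257536f;
static const float C3 = 0.01996337677405357f;

/* Expanded form, as written in the template: six multiplications. */
static float poly_expanded(float a)
{
    return C0 - C1 * a + C2 * a * a - C3 * a * a * a;
}

/* Horner form: the same polynomial with three multiplications. */
static float poly_horner(float a)
{
    return ((-C3 * a + C2) * a - C1) * a + C0;
}
```

Both functions compute the same polynomial, so they should agree up to rounding; only the number and order of operations differ.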
Also notice that single-precision floating point numbers do not need so many significant digits ([see this nice tool](https://evanw.github.io/float-toy/)), and that native's coefficients are slightly different from yours; I'm not sure why. Where do your coefficients come from? Your tests have a rather large error margin set: while this is not a problem on its own, maybe using the same coefficients as native would let you make it smaller.
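As a rough illustration of the point about significant digits, the following C sketch checks that a `float` survives being printed with nine significant digits and parsed back. It assumes IEEE 754 single precision and a correctly rounding `snprintf`/`strtof`; the helper name `survives_nine_digits` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if printing f with 9 significant digits and parsing the
 * result back yields exactly the same float, 0 otherwise. */
static int survives_nine_digits(float f)
{
    char buf[32];

    snprintf(buf, sizeof(buf), "%.9g", f);
    return strtof(buf, NULL) == f;
}
```

For example, `survives_nine_digits(0.2127403136003234f)` is expected to return 1 under those assumptions, which suggests the extra digits in the template do not change the value the shader ends up using.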

-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/364#note_46616
