On Mon Sep 25 11:23:49 2023 +0000, Giovanni Mascellani wrote:
Notice that native, at least in my tests, evaluates the polynomial using [Horner's method](https://en.wikipedia.org/wiki/Horner%27s_method), which is probably more efficient (it takes only three multiplications instead of six in this case, if I'm not mistaken). That would amount to something like `(((-0.01996337677405357f * abs_arg + 0.07612092595257536f) * abs_arg - 0.2127403136003234f) * abs_arg + 1.570796325f)`. Notice that floating point numbers do not need so many significant digits ([see this nice tool](https://evanw.github.io/float-toy/)), and that native's coefficients are slightly different from yours; I'm not sure why. Where do your coefficients come from? Your tests have a rather large error margin set: while that is not a problem in its own right, using the same coefficients as native might let you tighten it.
Evan Tang wrote the algorithms for all these. He solved for the coefficients using Wolfram Alpha because he felt a little worried about using Microsoft's output directly. Apparently, Evan's numbers are slightly less accurate overall, but don't yield a discontinuity at X=0.
It doesn't look like he has an account here, but we could pull him into the discussion on Matrix?