When a NAN is fed as input, return it as-is without mangling it through the calculation (which e.g. loses the sign of the input value).
This fixes a regression in a testcase of mine, after switching to the internal implementation of these functions.
Signed-off-by: Martin Storsjo martin@martin.st --- As these routines are imported from musl, I guess I should try to upstream the same fix to them too.
v2: Keep the math error reporting for msvcrt < 140 in tanh. --- dlls/msvcrt/math.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/dlls/msvcrt/math.c b/dlls/msvcrt/math.c index ecf5fdce4d8..9b69f6bea56 100644 --- a/dlls/msvcrt/math.c +++ b/dlls/msvcrt/math.c @@ -1139,6 +1139,9 @@ float CDECL coshf( float x ) UINT32 ui = *(UINT32*)&x; float t;
+ if (isnan(x)) + return x; + /* |x| */ ui &= 0x7fffffff; x = *(float*)&ui; @@ -1676,6 +1679,9 @@ float CDECL sinhf( float x ) UINT32 ui = *(UINT32*)&x; float t, h, absx;
+ if (isnan(x)) + return x; + h = 0.5; if (ui >> 31) h = -h; @@ -1885,9 +1891,11 @@ float CDECL tanhf( float x ) if (ui > 0x3f0c9f54) { /* |x| > log(3)/2 ~= 0.5493 or nan */ if (ui > 0x41200000) { -#if _MSVCR_VER < 140 if (isnan(x)) +#if _MSVCR_VER < 140 return math_error(_DOMAIN, "tanhf", x, 0, x); +#else + return x; #endif /* |x| > 10 */ fp_barrierf(x + 0x1p120f); @@ -2769,6 +2777,9 @@ double CDECL cosh( double x ) UINT32 w; double t;
+ if (isnan(x)) + return x; + /* |x| */ ux &= (uint64_t)-1 / 2; x = *(double*)&ux; @@ -4035,6 +4046,9 @@ double CDECL sinh( double x ) UINT32 w; double t, h, absx;
+ if (isnan(x)) + return x; + h = 0.5; if (ux >> 63) h = -h; @@ -4331,9 +4345,11 @@ double CDECL tanh( double x ) if (w > 0x3fe193ea) { /* |x| > log(3)/2 ~= 0.5493 or nan */ if (w > 0x40340000) { -#if _MSVCR_VER < 140 if (isnan(x)) +#if _MSVCR_VER < 140 return math_error(_DOMAIN, "tanh", x, 0, x); +#else + return x; #endif /* |x| > 20 or nan */ /* note: this branch avoids raising overflow */
On 7/16/21 7:05 AM, Martin Storsjo wrote:
When a NAN is fed as input, return it as-is without mangling it through the calculation (which e.g. loses the sign of the input value).
This fixes a regression in a testcase of mine, after switching to the internal implementation of these functions.
Maybe we could add such tests to Wine as well?
On Fri, 16 Jul 2021, Zebediah Figura (she/her) wrote:
On 7/16/21 7:05 AM, Martin Storsjo wrote:
When a NAN is fed as input, return it as-is without mangling it through the calculation (which e.g. loses the sign of the input value).
This fixes a regression in a testcase of mine, after switching to the internal implementation of these functions.
Maybe we could add such tests to Wine as well?
Yeah I guess that's best. I had set out hoping this would be trivial enough to work without, but it turned out to be quite a bit more fiddly than I first expected.
// Martin
On 16-07-2021 14:05, Martin Storsjo wrote:
@@ -4331,9 +4345,11 @@ double CDECL tanh( double x ) if (w > 0x3fe193ea) { /* |x| > log(3)/2 ~= 0.5493 or nan */ if (w > 0x40340000) { -#if _MSVCR_VER < 140 if (isnan(x)) +#if _MSVCR_VER < 140 return math_error(_DOMAIN, "tanh", x, 0, x); +#else
return x;
#endif /* |x| > 20 or nan */ /* note: this branch avoids raising overflow */
There are many comments about a value possibly being nan, such as the above one, that should probably all be updated.
On Fri, 16 Jul 2021, Sven Baars wrote:
On 16-07-2021 14:05, Martin Storsjo wrote:
@@ -4331,9 +4345,11 @@ double CDECL tanh( double x ) if (w > 0x3fe193ea) { /* |x| > log(3)/2 ~= 0.5493 or nan */ if (w > 0x40340000) { -#if _MSVCR_VER < 140 if (isnan(x)) +#if _MSVCR_VER < 140 return math_error(_DOMAIN, "tanh", x, 0, x); +#else
return x;
#endif /* |x| > 20 or nan */ /* note: this branch avoids raising overflow */
There are many comments about a value possibly being nan, such as the above one, that should probably all be updated.
Thanks, will do!
// Martin
Hi Martin,
On 7/16/21 2:05 PM, Martin Storsjo wrote:
@@ -1139,6 +1139,9 @@ float CDECL coshf( float x ) UINT32 ui = *(UINT32*)&x; float t;
- if (isnan(x))
return x;
A quick test shows that ucrtbase returns something like: if (isnan(x)) { *(DWORD*)&x |= 0x400000; return x; }
Probably it will make sense to move the code to integrate better with musl implementation. Instead of checking for nan with isnan function it can be done here: /* |x| > log(FLT_MAX) or nan */ if (ui > 0x7f800000) *(DWORD*)&t = *(DWORD*)&x | 0x400000; else t = __expo2f(x, 1.0f); return t;
Thanks, Piotr
On Fri, 16 Jul 2021, Piotr Caban wrote:
Hi Martin,
On 7/16/21 2:05 PM, Martin Storsjo wrote:
@@ -1139,6 +1139,9 @@ float CDECL coshf( float x ) UINT32 ui = *(UINT32*)&x; float t;
- if (isnan(x))
return x;
A quick test shows that ucrtbase returns something like: if (isnan(x)) { *(DWORD*)&x |= 0x400000; return x; }
Probably it will make sense to move the code to integrate better with musl implementation. Instead of checking for nan with isnan function it can be done here: /* |x| > log(FLT_MAX) or nan */ if (ui > 0x7f800000) *(DWORD*)&t = *(DWORD*)&x | 0x400000; else t = __expo2f(x, 1.0f); return t;
Hmm, yes, except all of these functions start out by masking out the sign bit of the input value, so if integrating the nan case at the end of the function, the sign bit is lost. (My own existing tests for nan preservation happen to be tests that only check whether the sign bit is preserved, nothing else.)
The extra explicit upfront isnan() tests at the top aren't exactly nice though... Or should we just OR in the original sign bit too, at this point?
// Martin
On 27 Jul 2021, at 00:01, Martin Storsjö martin@martin.st wrote:
On Fri, 16 Jul 2021, Piotr Caban wrote:
Hi Martin,
On 7/16/21 2:05 PM, Martin Storsjo wrote:
@@ -1139,6 +1139,9 @@ float CDECL coshf( float x ) UINT32 ui = *(UINT32*)&x; float t;
- if (isnan(x))
return x;
A quick test shows that ucrtbase returns something like: if (isnan(x)) { *(DWORD*)&x |= 0x400000; return x; }
Probably it will make sense to move the code to integrate better with musl implementation. Instead of checking for nan with isnan function it can be done here: /* |x| > log(FLT_MAX) or nan */ if (ui > 0x7f800000) *(DWORD*)&t = *(DWORD*)&x | 0x400000; else t = __expo2f(x, 1.0f); return t;
Hmm, yes, except all of these functions start out by masking out the sign bit of the input value, so if integrating the nan case at the end of the function, the sign bit is lost. (My own existing tests for nan preservation happen to be tests that only check whether the sign bit is preserved, nothing else.)
The extra explicit upfront isnan() tests at the top aren't exactly nice though... Or should we just OR in the original sign bit too, at this point?
If this is a general question, then yes, is some cases it might be needed to add sign bit. In this case it’s already preserved because x is used (not ui that has the sign bit cleared).
Thanks, Piotr
On Tue, 27 Jul 2021, Piotr Caban wrote:
On 27 Jul 2021, at 00:01, Martin Storsjö martin@martin.st wrote:
On Fri, 16 Jul 2021, Piotr Caban wrote:
Hi Martin,
On 7/16/21 2:05 PM, Martin Storsjo wrote:
Probably it will make sense to move the code to integrate better with musl implementation. Instead of checking for nan with isnan function it can be done here: /* |x| > log(FLT_MAX) or nan */ if (ui > 0x7f800000) *(DWORD*)&t = *(DWORD*)&x | 0x400000; else t = __expo2f(x, 1.0f); return t;
Hmm, yes, except all of these functions start out by masking out the sign bit of the input value, so if integrating the nan case at the end of the function, the sign bit is lost. (My own existing tests for nan preservation happen to be tests that only check whether the sign bit is preserved, nothing else.)
The extra explicit upfront isnan() tests at the top aren't exactly nice though... Or should we just OR in the original sign bit too, at this point?
If this is a general question, then yes, is some cases it might be needed to add sign bit. In this case it’s already preserved because x is used (not ui that has the sign bit cleared).
Yes, but ui, with the sign bit cleared, is also written back to x right after clearing the sign bit, with e.g. x = *(float*)&ui;
But we can store the sign bit separately and reapply it at the end. But I'll give it a shot at writing it that way.
// Martin