msvcrt:scanf float conversion behavior

List overview All Threads

newer

older

Bugs To Fix?

[PATCH] quartz/tests: Skip the...

Erich E. Hoover

23 Dec 2019 23 Dec '19

2:52 p.m.

I am trying to resolve places where "long double" is being used to compute the value of a floating point number and I am currently looking at resolving this issue in the scanf family of functions. I'm not the hugest fan of having duplicated code running around, so I'm wondering if anyone knows why scanf float behavior isn't implemented using [str|wcs]tod (or possibly the other way around). So, does anyone know why this is the case? Would a patch be accepted that consolidates these instances?

Best, Erich

Show replies by date

Jacek Caban

23 Dec 23 Dec

3:12 p.m.

Hi Erich,

On 12/23/19 3:52 PM, Erich E. Hoover wrote:

...

I am trying to resolve places where "long double" is being used to compute the value of a floating point number and I am currently looking at resolving this issue in the scanf family of functions. I'm not the hugest fan of having duplicated code running around, so I'm wondering if anyone knows why scanf float behavior isn't implemented using [str|wcs]tod (or possibly the other way around). So, does anyone know why this is the case? Would a patch be accepted that consolidates these instances?

I heard that Piotr was looking at this before leaving for vacations. I don't know how far he got. He will be back later this week and it may be good to coordinate with him.

Jacek

Piotr Caban

26 Dec 26 Dec

2:04 p.m.

Hi Erich,

On 12/23/19 3:52 PM, Erich E. Hoover wrote:

...

I am trying to resolve places where "long double" is being used to compute the value of a floating point number and I am currently looking at resolving this issue in the scanf family of functions. I'm not the hugest fan of having duplicated code running around, so I'm wondering if anyone knows why scanf float behavior isn't implemented using [str|wcs]tod (or possibly the other way around). So, does anyone know why this is the case? Would a patch be accepted that consolidates these instances?

I was working on different strtod implementation (ucrtbase (as well as glibc) uses sub 0.5 ulp precision algorithm). It's not ready yet. Also it's not something that can be added during code-freeze.

I was not working on scanf changes. It definitely makes sense to not duplicate the code.

Thanks, Piotr

Erich E. Hoover

27 Dec 27 Dec

3:41 a.m.

On Thu, Dec 26, 2019 at 7:05 AM Piotr Caban piotr.caban@gmail.com wrote:

...

... I was working on different strtod implementation (ucrtbase (as well as glibc) uses sub 0.5 ulp precision algorithm). It's not ready yet. Also it's not something that can be added during code-freeze.

Yeah, I was struggling to get something that passes this scanf test (without breaking the more numerous strtod tests): === ret = vsscanf_wrapper(tests[i], "1.1e-30", "%lf", &double_res); ok(ret == 1, "sscanf returned %d for flags %#x\n", ret, tests[i]); ok(double_res == 1.1e-30, "got wrong double %.16e for flags %#x\n", double_res, tests[i]); === I finally came up with the attached the other day, which seems to do the trick*. If you think AJ might be okay with it then would only leave webservices/reader.c:str_to_double with "long double" usage, which I now have a patch for implementing with scanf.

...

I was not working on scanf changes. It definitely makes sense to not duplicate the code.

I've been poking at this a little bit and it looks like scanf supports everything that strtod and wcstod need. There are a couple things that our scanf implementation doesn't support yet (inf/nan, Fortran-style exponent notation, and hex notation for floats), but once those are in place we should be good to go. So, it may make sense to centralize everything in scanf (rather than the other way around) so that there isn't any duplication for wide character support.

Best, Erich

* Also works well enough that the new compare_double should not be necessary, unless you've come up with some more test cases ;)

Greg

4:37 a.m.

On Thu, 2019-12-26 at 20:41 -0700, Erich E. Hoover wrote:

...

On Thu, Dec 26, 2019 at 7:05 AM Piotr Caban piotr.caban@gmail.com wrote:

...
...I was working on different strtod implementation (ucrtbase (as well asglibc) uses sub 0.5 ulp precision algorithm). It's not ready yet. Alsoit's not something that can be added during code-freeze.

Yeah, I was struggling to get something that passes this scanf test(without breaking the more numerous strtod tests):===ret = vsscanf_wrapper(tests[i], "1.1e-30", "%lf", &double_res);ok(ret == 1, "sscanf returned %d for flags %#x\n", ret, tests[i]);ok(double_res == 1.1e-30, "got wrong double %.16e for flags %#x\n",double_res, tests[i]);===I finally came up with the attached the other day, which seems to dothe trick*. If you think AJ might be okay with it then would onlyleave webservices/reader.c:str_to_double with "long double" usage,which I now have a patch for implementing with scanf.

...
I was not working on scanf changes. It definitely makes sense to notduplicate the code.

I've been poking at this a little bit and it looks like scanf supportseverything that strtod and wcstod need. There are a couple thingsthat our scanf implementation doesn't support yet (inf/nan,Fortran-style exponent notation, and hex notation for floats), butonce those are in place we should be good to go. So, it may makesense to centralize everything in scanf (rather than the other wayaround) so that there isn't any duplication for wide charactersupport. Best,Erich

Also works well enough that the new compare_double should not

benecessary, unless you've come up with some more test cases ;)

FYI, The existing NaN behaviour for the scanf family is leading to issues with a game. My report is here: https://bugs.winehq.org/show_bug.cgi?id=48323 Returning NaN may not help that issue, but at least might be a better to deal with than just not parsing.

Regards,

Greg

Piotr Caban

11:46 a.m.

On 12/27/19 4:41 AM, Erich E. Hoover wrote:

...

On Thu, Dec 26, 2019 at 7:05 AM Piotr Caban piotr.caban@gmail.com wrote:

...
... I was working on different strtod implementation (ucrtbase (as well as glibc) uses sub 0.5 ulp precision algorithm). It's not ready yet. Also it's not something that can be added during code-freeze.

Yeah, I was struggling to get something that passes this scanf test (without breaking the more numerous strtod tests): === ret = vsscanf_wrapper(tests[i], "1.1e-30", "%lf", &double_res); ok(ret == 1, "sscanf returned %d for flags %#x\n", ret, tests[i]); ok(double_res == 1.1e-30, "got wrong double %.16e for flags %#x\n", double_res, tests[i]); === I finally came up with the attached the other day, which seems to do the trick*. If you think AJ might be okay with it then would only leave webservices/reader.c:str_to_double with "long double" usage, which I now have a patch for implementing with scanf.

The 1.1e-30 tests is an example of a test that stopped working in strtod with your patches. While working on new implementation of strtod I was running Chromium string conversion tests (it's a 100k semi random doubles conversion test). Here are some results: ucrtbase: - Windows: 0 failures - Current 32-bit Wine: 32062 failures, biggest error: 3 ulp - 6e7f357 32-bit Wine: 36 failures, biggest error: 1 ulp msvcrt: - Windows 10: 16 failures

ucrtbase scanf tests: - Windows: 0 failures - Current 32-bit Wine: 36 failures, biggest error: 1 ulp - Current 32-bit Wine with the patch you have attached: 49779 failures, the exponent is sometimes off by 1, e.g. on "1.5475148155234508e-252"

The test that produces the biggest error for strtod function is: 4.0621786324484882e-053

I'm afraid that your patches will possibly cause tons of regressions and I'm not sure yet how to proceed with it during the code-freeze. There's a quite simple algorithm that should have similar results as msvcrt (I will need to implement it in order to know) that can possibly go into Wine during code-freeze. On the other hand it will need to be rewritten after the code-freeze so I'm not a fun of it. The proper algorithm I was working on will definitely exceed 1k lines of code.

...

...
I was not working on scanf changes. It definitely makes sense to not duplicate the code.

I've been poking at this a little bit and it looks like scanf supports everything that strtod and wcstod need. There are a couple things that our scanf implementation doesn't support yet (inf/nan, Fortran-style exponent notation, and hex notation for floats), but once those are in place we should be good to go. So, it may make sense to centralize everything in scanf (rather than the other way around) so that there isn't any duplication for wide character support.

I was thinking about having a common helper that will be used both in strtod and scanf. Proper string to double conversion is complicated and I don't think it should be mixed with scanf implementation.

Changing the strtod_helper signature to something like: double strtod_helper(MSVCRT_wchar_t get(void *ctx), void unget(MSVCRT_wchar_t c, void *ctx), void *ctx, MSVCRT__locale_t locale, int *err) should make it usable both for strtod like functions and scanf.

...

Also works well enough that the new compare_double should not be

necessary, unless you've come up with some more test cases ;)

It will be still needed in string.c:3074. The test is written in a way that exact comparison will not work on Windows. It would probably also make sense to add most conversion tests to ucrtbase because other versions have bugs that we probably don't want to reproduce.

Thanks, Piotr

Erich E. Hoover

5:13 p.m.

On Fri, Dec 27, 2019 at 4:48 AM Piotr Caban piotr.caban@gmail.com wrote:

...

On 12/27/19 4:41 AM, Erich E. Hoover wrote:

...
On Thu, Dec 26, 2019 at 7:05 AM Piotr Caban piotr.caban@gmail.com

wrote:

...

...
...
... I was working on different strtod implementation (ucrtbase (as well as glibc) uses sub 0.5 ulp precision algorithm). It's not ready yet. Also it's not something that can be added during code-freeze.

Yeah, my approach was to try and get something that can cope with the 53-bit "long double" problem on x86-64 until after code freeze. AJ wasn't particularly thrilled with adjusting the x87 FPU settings as a workaround.

...

... The 1.1e-30 tests is an example of a test that stopped working in strtod with your patches.

Since this test wasn't in strtod I didn't catch it until trying to fix scanf.

...

While working on new implementation of strtod I was running Chromium string conversion tests (it's a 100k semi random doubles conversion test). Here are some results: ...

I'd be curious to see what this looks like with my new approach, once I understand why you are getting different test results from me.

...

ucrtbase scanf tests:

Windows: 0 failures

Current 32-bit Wine: 36 failures, biggest error: 1 ulp

Current 32-bit Wine with the patch you have attached: 49779

failures, the exponent is sometimes off by 1, e.g. on "1.5475148155234508e-252"

Just to confirm, you're running the current tests for dlls/ucrtbase/tests/scanf.c on 32-bit x86? The reason that I ask is that I'm getting 0 failures for that (both with and without the patch) and 0 ulp for 1.5475148155234508e-252.

...

The test that produces the biggest error for strtod function is: 4.0621786324484882e-053

I don't see this test, so I popped it in really quick and I'm getting 1 ulp with the patch (4 without).

...

I'm afraid that your patches will possibly cause tons of regressions and I'm not sure yet how to proceed with it during the code-freeze.

I think we need to figure out why we're getting different results on 32-bit. I've been doing most of my testing on 64-bit (since that's what broke horribly with the _control87 update), but the version I attached should pass all the current tests on both (whether applied to strtod or scanf).

...

There's a quite simple algorithm that should have similar results as msvcrt (I will need to implement it in order to know) that can possibly go into Wine during code-freeze. On the other hand it will need to be rewritten after the code-freeze so I'm not a fun of it. The proper algorithm I was working on will definitely exceed 1k lines of code.

Yikes, I figured a properly compliant implementation would be big - but not that big.

...

... I was thinking about having a common helper that will be used both in strtod and scanf. Proper string to double conversion is complicated and I don't think it should be mixed with scanf implementation. ...

Makes sense to me.

Best, Erich

Piotr Caban

5:51 p.m.

On 12/27/19 6:13 PM, Erich E. Hoover wrote:

...

I don't see this test, so I popped it in really quick and I'm getting 1 ulp with the patch (4 without).

I've copied part of the test strings only, sorry for that: 4.0621786324484881721115322e-53 (this one has problems in current wine) 1.5475148155234508405655166e-252 (this one has problems with the scanf patch you have attached)

I'm currently in the middle of testing a different approach to solving that so I can't easily run the tests now. For the "4.06..." test I'm getting following failure: got 4.0621786324484868e-053, expected 4.0621786324484882e-053 (diff -3) where the first number is what Wine's strtod implementation returned and second one is how compiler parses the number during compilation.

Thanks, Piotr

2018

Age (days ago)

2022

Last active (days ago)

wine-devel@winehq.org

7 comments

4 participants

tags (0)

participants (4)

Erich E. Hoover
Greg
Jacek Caban
Piotr Caban