On Wed, 4 Aug 2021, Francois Gouget wrote:
It turns out that the oleaut32 caching is not needed at all. I will send an updated patch.
The attached patch can be used to collect more information about run times in each case:
* If both the GetLocaleInfoW() and GetLocalisedNumberChars() caching are disabled (s/01/0/ in locale.c):

  GetLocalisedNumberChars(lcid=400,NOUSEROVERRIDE,cache=0) took 1422 ns
  GetLocalisedNumberChars(lcid=409,NOUSEROVERRIDE,cache=0) took 964 ns
  GetLocalisedNumberChars(lcid=40c,NOUSEROVERRIDE,cache=0) took 1012 ns
  GetLocalisedNumberChars(lcid=407,NOUSEROVERRIDE,cache=0) took 981 ns
  GetLocalisedNumberChars(lcid=400,0,cache=0) took 69000 ns
  GetLocalisedNumberChars(lcid=409,0,cache=0) took 1600 ns
  GetLocalisedNumberChars(lcid=40c,0,cache=0) took 68900 ns
  GetLocalisedNumberChars(lcid=407,0,cache=0) took 1600 ns
Given that my locale is fr_FR.UTF-8, only the LOCALE_USER_DEFAULT (400) and LANG_FRENCH (40c) cases go through the registry. And that's where it's slow without caching: ~70000 ns per call. All the other cases just query the resources and are 40+ times faster.
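For reference, per-call figures like these are typically obtained by timing a loop and dividing by the iteration count. The patch itself is not reproduced here, so the following is only a sketch of that approach in portable C11. The names average_ns() and count_call() are made up for the illustration, and timespec_get() reads the wall clock; a monotonic clock would be preferable in a real benchmark.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical workload: just counts how often it was called. */
static void count_call(void *arg)
{
    ++*(unsigned *)arg;
}

/* Average cost of one fn(arg) call in nanoseconds, measured over
 * `iterations` calls. Returns 0 when iterations is 0. */
static uint64_t average_ns(void (*fn)(void *), void *arg, unsigned iterations)
{
    struct timespec start, end;
    int64_t total;
    unsigned i;

    if (!iterations) return 0;

    timespec_get(&start, TIME_UTC);
    for (i = 0; i < iterations; i++)
        fn(arg);
    timespec_get(&end, TIME_UTC);

    total = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
            + (end.tv_nsec - start.tv_nsec);
    if (total < 0) total = 0;  /* the wall clock may step backwards */
    return (uint64_t)total / iterations;
}
```

Averaging over many iterations matters here because the cached cases take only a few tens of nanoseconds, which is close to the cost of reading the clock itself.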
* If only the GetLocalisedNumberChars() caching is disabled:

  GetLocalisedNumberChars(lcid=400,NOUSEROVERRIDE,cache=0) took 1195 ns
  GetLocalisedNumberChars(lcid=409,NOUSEROVERRIDE,cache=0) took 820 ns
  GetLocalisedNumberChars(lcid=40c,NOUSEROVERRIDE,cache=0) took 809 ns
  GetLocalisedNumberChars(lcid=407,NOUSEROVERRIDE,cache=0) took 852 ns
  GetLocalisedNumberChars(lcid=400,0,cache=0) took 946 ns
  GetLocalisedNumberChars(lcid=409,0,cache=0) took 1156 ns
  GetLocalisedNumberChars(lcid=40c,0,cache=0) took 642 ns
  GetLocalisedNumberChars(lcid=407,0,cache=0) took 1157 ns
Only the 400 and 40c cases go through the GetLocaleInfoW() registry cache. These are slightly faster than querying the resources but not by much (~650-950 ns instead of ~1150 ns).
* And with GetLocalisedNumberChars() caching:

  GetLocalisedNumberChars(lcid=400,NOUSEROVERRIDE,cache=2) took 16 ns
  GetLocalisedNumberChars(lcid=409,NOUSEROVERRIDE,cache=2) took 16 ns
  GetLocalisedNumberChars(lcid=40c,NOUSEROVERRIDE,cache=2) took 32 ns
  GetLocalisedNumberChars(lcid=407,NOUSEROVERRIDE,cache=2) took 17 ns
  GetLocalisedNumberChars(lcid=400,0,cache=2) took 16 ns
  GetLocalisedNumberChars(lcid=409,0,cache=2) took 16 ns
  GetLocalisedNumberChars(lcid=40c,0,cache=2) took 16 ns
  GetLocalisedNumberChars(lcid=407,0,cache=2) took 16 ns
GetLocalisedNumberChars()'s caching is the fastest by far (>70x) since it's a simple memcpy().
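To illustrate why a hit is that cheap, here is a minimal sketch of a cache where a hit costs a key comparison plus a memcpy(). It is not the actual oleaut32 code: the number_chars struct, the two-slot layout, slow_lookup() and its placeholder values are all assumptions made for the example, and the locking a real implementation would need is left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-locale number characters. */
typedef struct {
    uint16_t decimal_point;
    uint16_t thousand_sep;
    uint16_t negative_sign;
} number_chars;

typedef struct {
    int valid;
    uint32_t lcid;
    uint32_t flags;
    number_chars chars;
} cache_entry;

/* Assumed layout: one slot for flags == 0, one for any other flags. */
static cache_entry cache[2];
static unsigned slow_lookups;  /* counts how often the slow path ran */

/* Placeholder for the expensive path (registry or resources).
 * The values are made up, not real locale data. */
static void slow_lookup(uint32_t lcid, uint32_t flags, number_chars *out)
{
    (void)flags;
    slow_lookups++;
    out->decimal_point = (lcid == 0x40c || lcid == 0x407) ? ',' : '.';
    out->thousand_sep = (lcid == 0x40c || lcid == 0x407) ? '.' : ',';
    out->negative_sign = '-';
}

static void get_number_chars(uint32_t lcid, uint32_t flags, number_chars *out)
{
    cache_entry *e = &cache[flags ? 1 : 0];

    if (e->valid && e->lcid == lcid && e->flags == flags)
    {
        /* Cache hit: no registry or resource access at all. */
        memcpy(out, &e->chars, sizeof(*out));
        return;
    }

    /* Cache miss: take the slow path and remember the result. */
    slow_lookup(lcid, flags, out);
    e->valid = 1;
    e->lcid = lcid;
    e->flags = flags;
    memcpy(&e->chars, out, sizeof(*out));
}
```

With a layout like this, repeated queries for the same locale never reach the slow path, but alternating between two locales with the same flags would miss every time.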
So with the patch submitted in this thread we keep the high performance when querying anything but the registry locale settings. But as mentioned before it's not clear how often that would be used. Maybe it can benefit some applications that parse a lot of data that's not in the user's locale (maybe because it comes from some database or spreadsheet).
But if we consider GetLocaleInfoW()'s performance good enough then the GetLocalisedNumberChars() caching can be removed entirely.