[PATCH v2 0/1] MR10805: Draft: msvcrt: only convert _tolower_l if not exact match already
This optimization brought some speed-ups in https://gitlab.winehq.org/wine/wine/-/merge_requests/10804 for the equivalent `memicmp_strW` function, as it avoids the conversion and the locale lookup for exact matches.

-- 
v2: msvcrt: only convert _tolower_l if not exact match already

https://gitlab.winehq.org/wine/wine/-/merge_requests/10805
From: Stephan Seitz <stephan.seitz@fau.de>

This optimization brought some speed-ups in
https://gitlab.winehq.org/wine/wine/-/merge_requests/10804 for the
equivalent `memicmp_strW` function.
---
 dlls/msvcrt/string.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/dlls/msvcrt/string.c b/dlls/msvcrt/string.c
index 6b4ffbf45b5..2719ab37e9c 100644
--- a/dlls/msvcrt/string.c
+++ b/dlls/msvcrt/string.c
@@ -3435,8 +3435,11 @@ int __cdecl _memicmp_l(const void *v1, const void *v2, size_t len, _locale_t loc
     while (len--)
     {
-        if ((ret = _tolower_l(*s1, locale) - _tolower_l(*s2, locale)))
-            break;
+        if (*s1 != *s2) {
+            /* only convert _tolower_l if not exact match already */
+            if ((ret = _tolower_l(*s1, locale) - _tolower_l(*s2, locale)))
+                break;
+        }
         s1++;
         s2++;
     }
-- 
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/10805
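For readers who want to try the idea outside of Wine, the change in the patch above can be sketched as a standalone comparison. This is only an illustration under stated assumptions: it uses the standard C `tolower()` in place of Wine's `_tolower_l`, reads bytes as `unsigned char`, and the names `memicmp_baseline` and `memicmp_fastpath` are hypothetical, not Wine functions.

```c
#include <ctype.h>
#include <stddef.h>

/* Baseline (hypothetical name): lowercase both bytes on every iteration. */
static int memicmp_baseline(const void *v1, const void *v2, size_t len)
{
    const unsigned char *s1 = v1, *s2 = v2;
    int ret = 0;

    while (len--)
    {
        if ((ret = tolower(*s1) - tolower(*s2)))
            break;
        s1++;
        s2++;
    }
    return ret;
}

/* Fast path (hypothetical name): identical bytes lowercase to the same
 * value, so the conversion is only needed when the raw bytes differ. */
static int memicmp_fastpath(const void *v1, const void *v2, size_t len)
{
    const unsigned char *s1 = v1, *s2 = v2;
    int ret = 0;

    while (len--)
    {
        if (*s1 != *s2 && (ret = tolower(*s1) - tolower(*s2)))
            break;
        s1++;
        s2++;
    }
    return ret;
}
```

Both variants should return the same result for any input, because equal bytes always map to equal lowercase values. Whether the extra comparison pays off likely depends on how often the inputs match exactly.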
If I wanted this function to be faster, my first idea would be to change _tolower_l (dlls/msvcrt/ctype.h) to skip the locinfo->pctype[c] & _UPPER check in the c < 256 path and always return locinfo->pclmap[c]. That is one fewer memory access if the character is uppercase, and one fewer branch for all inputs. It could be combined with your proposed optimization, of course. But I would want numbers on how much impact those optimizations have.

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/10805#note_138663
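The suggestion above can be illustrated with a mock locale table. This is a rough sketch, not Wine's code: `struct mock_locinfo`, `MOCK_UPPER`, `mock_locinfo_init`, `tolower_checked` and `tolower_table` are all hypothetical names, and the tables are filled from the C library's `isupper()` and `tolower()`. It also assumes that the lowercase map is the identity for non-uppercase characters, which is what makes dropping the flag check safe.

```c
#include <ctype.h>

#define MOCK_UPPER 0x1  /* hypothetical stand-in for the _UPPER flag */

/* Hypothetical, simplified stand-in for the locale data used by _tolower_l. */
struct mock_locinfo
{
    unsigned short pctype[256]; /* character class flags */
    unsigned char pclmap[256];  /* lowercase mapping table */
};

/* Fill the tables from the current C locale. */
static void mock_locinfo_init(struct mock_locinfo *li)
{
    int c;

    for (c = 0; c < 256; c++)
    {
        li->pctype[c] = isupper(c) ? MOCK_UPPER : 0;
        li->pclmap[c] = (unsigned char)tolower(c);
    }
}

/* Variant with the flag check: reads pctype, then pclmap if uppercase. */
static int tolower_checked(const struct mock_locinfo *li, int c)
{
    if ((unsigned)c < 256)
        return (li->pctype[c] & MOCK_UPPER) ? li->pclmap[c] : c;
    return c; /* wider inputs are out of scope for this sketch */
}

/* Variant without the flag check: always a single pclmap lookup. */
static int tolower_table(const struct mock_locinfo *li, int c)
{
    if ((unsigned)c < 256)
        return li->pclmap[c];
    return c; /* wider inputs are out of scope for this sketch */
}
```

Under the identity assumption the two variants agree for every c < 256, so any difference would be in speed only, and that still needs to be measured.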
participants (3)
- Alfred Agrell (@Alcaro)
- Stephan Seitz
- Stephan Seitz (@theHamsta)