[PATCH 0/1] MR10804: Draft: unicode: optimize memicmp_strW by doing case-sensitive comparision first
Call to slow `to_lower` can be avoided, if characters already match case-sensitive. This cuts the time for wine boot in half `17.165 s` -> `8879.9 ms` Wine-bug: https://bugs.winehq.org/show_bug.cgi?id=59695 -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10804
From: Stephan Seitz <stephan.seitz@fau.de> Call to slow `to_lower` can be avoided, if characters already match case-sensitive. This cuts the time for wine boot in half `17.165 s` -> `8879.9 ms` Wine-bug: https://bugs.winehq.org/show_bug.cgi?id=59695 --- server/unicode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/server/unicode.c b/server/unicode.c index bb39b55e50c..62b52191610 100644 --- a/server/unicode.c +++ b/server/unicode.c @@ -75,8 +75,13 @@ int memicmp_strW( const WCHAR *str1, const WCHAR *str2, data_size_t len ) { int ret = 0; - for (len /= sizeof(WCHAR); len; str1++, str2++, len--) + for (len /= sizeof(WCHAR); len; str1++, str2++, len--) { + // when chars match case-sensitive, we can avoid slow to_lower + if (*str1 == *str2) { + continue; + } if ((ret = to_lower(*str1) - to_lower(*str2))) break; + } return ret; } -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10804
Leaving this draft for now, since I assume this can be improved further. memicmp_strW is called in this bisect search (which seems to implement a hash table) https://gitlab.winehq.org/wine/wine/-/blob/master/server/registry.c?ref_type... -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10804#note_138642
Using memcmp seems to bring the time down to 6.6s (this seems to be the lower limit for memicmp_strW) -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10804#note_138643
I assume this can be improved further
I'd assume the same. I don't have your 492M chonker of a registry, so I can't test my theory, but I'd check the last array entry first in find_subkey, and if the given key is greater or equal, return that and skip the binary search. I think the registry is stored in sorted form, so that check will succeed every time during boot. This would give an extra string compare on actual insertions, but that's probably worth it. (I can think of a few other possible optimizations, but I think they'd pessimize the case of normal-sized registries, be annoyingly complex, and/or do nothing if the above is in place, so I'll omit them for now. No point posting walls of unnecessary text.) -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10804#note_138644
participants (3)
-
Alfred Agrell (@Alcaro) -
Stephan Seitz -
Stephan Seitz (@theHamsta)