an optimization for _wcsicmp_l
it's meant to be used internally, but i added tests to try to match windows behavior
commit 24a2b625545f1875b5c3177f2b9da1b7299b864f degraded performance. perf showed most of the time was spent in get_language_sort. _wcsicmp_l is used heavily for std::map/set
_wcsicmp_l calls _towlower_l for each character of both strings, which in turn calls:
_towlower_l -> LCMapStringW -> LCMapStringEx -> get_language_sort
the sortguid is the same on every call since the same locale is passed each time, so repeating the lookup per character is redundant. using SORTHANDLE gets the performance back to roughly where it was
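
a minimal, portable sketch of the idea, not the actual patch: `lookup_sort`, `struct sort_data`, `icmp_slow` and `icmp_fast` are hypothetical names made up for illustration, and plain `towlower` stands in for the real case mapping. it only shows why resolving the sort data once per comparison beats resolving it once per character.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* hypothetical stand-in for the per-locale sort data */
struct sort_data { int id; };

static struct sort_data example_sort = { 1 };
static unsigned long lookup_count; /* counts simulated expensive lookups */

/* hypothetical stand-in for the expensive per-locale lookup */
static const struct sort_data *lookup_sort(const wchar_t *locale_name)
{
    (void)locale_name;
    lookup_count++;
    return &example_sort;
}

/* lowercase one character using already-resolved sort data (no lookup) */
static wchar_t lower_with_sort(const struct sort_data *sort, wchar_t ch)
{
    assert(sort != NULL);
    return (wchar_t)towlower((wint_t)ch);
}

/* slow path: the sort data is resolved again for every character */
static int icmp_slow(const wchar_t *locale_name, const wchar_t *a, const wchar_t *b)
{
    wchar_t ca, cb;
    do {
        ca = lower_with_sort(lookup_sort(locale_name), *a++);
        cb = lower_with_sort(lookup_sort(locale_name), *b++);
    } while (ca && ca == cb);
    return (ca > cb) - (ca < cb);
}

/* fast path: resolve once, then reuse the handle for every character */
static int icmp_fast(const wchar_t *locale_name, const wchar_t *a, const wchar_t *b)
{
    const struct sort_data *sort = lookup_sort(locale_name);
    wchar_t ca, cb;
    do {
        ca = lower_with_sort(sort, *a++);
        cb = lower_with_sort(sort, *b++);
    } while (ca && ca == cb);
    return (ca > cb) - (ca < cb);
}
```

in this sketch, comparing L"Hello" with L"hELLO" does 12 lookups in icmp_slow (two per loop iteration, including the terminator) and 1 in icmp_fast.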
here are all 5 commits, if you want to see where it's going: https://gitlab.winehq.org/dlehman25/wine/-/tree/wcsicmp4