Re: [PATCH] make_unicode: Change handling of Turkish i to match Windows
Daniel Lehman <dlehman(a)esri.com> wrote:
Windows does not lowercase the Turkish dotted I (0x130) to 'i' (0x69). It also does not uppercase the small dotless i (0x131) to 'I' (0x49).
Wine does the conversion for both, but Windows leaves them as-is.
The change here is to make Wine mimic Windows for the Turkish i.
Since you are modifying the win32 side of the unicode tables, please add the corresponding tests for LCMapString() to dlls/kernel32/tests/locale.c. It's quite possible that msvcrt doesn't use a win32 backend and has its own locale support.
-- Dmitry.
On 30 September 2016 at 14:23, Dmitry Timoshkov <dmitry(a)baikal.ru> wrote:
Since you are modifying the win32 side of the unicode tables, please add the corresponding tests for LCMapString() to dlls/kernel32/tests/locale.c. It's quite possible that msvcrt doesn't use a win32 backend and has its own locale support.
I could imagine LCMAP_LINGUISTIC_CASING making a difference here.
Windows' LCMapString behaves differently, but only if LCMAP_LINGUISTIC_CASING is specified. If that flag is not specified, it works just like tolower/toupper.

Wine's LCMapString currently calls tolowerW and doesn't support the LCMAP_LINGUISTIC_CASING flag. But the current table contains the conversions the LCMAP_LINGUISTIC_CASING flag needs if added in the future.

Should I change the fix to something like one of the following? Or just add tests with todo_wine for now?

I could have msvcrt tolower call LCMapString(LOWERCASE)? Then:
- LCMapString(LOWERCASE|CASING) calls the current tolowerW with the conversion
- LCMapString(LOWER) calls tolowerW except on certain characters (like Turkish i)
- msvcrt tolower variants call LCMapString(LOWER)

Or alternatively, could there be a new tolowerW function that doesn't do the conversion (maybe from a second table make_unicode produces)? Then:
- LCMapString(LOWERCASE|CASING) calls the current tolowerW with the conversion
- LCMapString(LOWERCASE) and msvcrt tolower variants call the new tolowerW without the conversion

This patch was for the Turkish I, but I have 1 other character with the same issue and I'm sure there are more.

Thanks
daniel
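For illustration only, the second alternative could be sketched roughly as below. This is a self-contained model, not Wine code: every `sketch_`/`SKETCH_` name is hypothetical, the flag values are made up (they are not the real LCMAP_* values), the "second table" is reduced to a single exception for U+0130, and only ASCII plus that one character are handled. It models the proposal exactly as worded above; whether the existing table really matches what LCMAP_LINGUISTIC_CASING needs is a separate question.

```c
#include <stdint.h>

typedef uint16_t sketch_wchar;      /* stand-in for WCHAR */

/* Hypothetical flag values, NOT the real LCMAP_* constants. */
#define SKETCH_LOWERCASE  0x1u
#define SKETCH_LINGUISTIC 0x2u

/* Models the current table: it lowercases U+0130 to 'i'.
 * Only ASCII and U+0130 are covered in this sketch. */
static sketch_wchar sketch_tolower_with_conversion(sketch_wchar ch)
{
    if (ch >= 'A' && ch <= 'Z') return (sketch_wchar)(ch + ('a' - 'A'));
    if (ch == 0x0130) return 'i';
    return ch;
}

/* Models the proposed second table: same as above, but U+0130 is left as-is. */
static sketch_wchar sketch_tolower_plain(sketch_wchar ch)
{
    if (ch == 0x0130) return ch;
    return sketch_tolower_with_conversion(ch);
}

/* Single-character model of the LCMapString dispatch: the linguistic flag
 * selects the table with the conversion, otherwise the plain table is used. */
static sketch_wchar sketch_map_lower(unsigned flags, sketch_wchar ch)
{
    if (!(flags & SKETCH_LOWERCASE)) return ch;
    return (flags & SKETCH_LINGUISTIC) ? sketch_tolower_with_conversion(ch)
                                       : sketch_tolower_plain(ch);
}

/* Model of an msvcrt tolower variant: always the plain mapping. */
static sketch_wchar sketch_msvcrt_towlower(sketch_wchar ch)
{
    return sketch_map_lower(SKETCH_LOWERCASE, ch);
}
```

The only point of the sketch is the shape of the dispatch: one flag check in the LCMapString path, and msvcrt sharing the plain path.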
On 30 September 2016 at 18:32, Daniel Lehman <dlehman(a)esri.com> wrote:
Wine's LCMapString currently calls tolowerW and doesn't support the LCMAP_LINGUISTIC_CASING flag. But the current table contains the conversions the LCMAP_LINGUISTIC_CASING flag needs if added in the future
Are you sure about that? If I understood the original patch description correctly, Wine currently has the following mappings:

    İ -> i
    i -> I
    I -> i
    ı -> I

While the correct (Windows) mappings would be:

with LCMAP_LINGUISTIC_CASING and tr or az locale:

    İ <-> i
    I <-> ı

without LCMAP_LINGUISTIC_CASING:

    I <-> i
    İ -> İ
    ı -> ı

I think doing this properly would need to take the data from http://www.unicode.org/Public/9.0.0/ucd/SpecialCasing.txt into account. http://www.unicode.org/reports/tr44/#Casemapping seems relevant.
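As a rough model of the two sets of mappings listed above, something like the following could be used to reason about expected results. It is a sketch under assumptions, not Wine or Windows code: all `sketch_` names are hypothetical, the locale is passed as a plain language string, only ASCII plus U+0130/U+0131 are handled, and the behaviour for linguistic casing with a non-Turkic locale (falling back to the plain mappings) is an assumption that the thread does not cover.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t sketch_wchar;      /* stand-in for WCHAR */

/* Returns nonzero for the two languages named in the thread. */
static int sketch_is_turkic(const char *lang)
{
    return lang && (!strcmp(lang, "tr") || !strcmp(lang, "az"));
}

static sketch_wchar sketch_lower(sketch_wchar ch, int linguistic, const char *lang)
{
    if (linguistic && sketch_is_turkic(lang)) {
        if (ch == 0x0130) return 'i';       /* İ -> i */
        if (ch == 'I')    return 0x0131;    /* I -> ı */
    } else if (ch == 0x0130) {
        return ch;                          /* İ maps to itself */
    }
    if (ch >= 'A' && ch <= 'Z') return (sketch_wchar)(ch + ('a' - 'A'));
    return ch;                              /* includes ı, which is already lowercase */
}

static sketch_wchar sketch_upper(sketch_wchar ch, int linguistic, const char *lang)
{
    if (linguistic && sketch_is_turkic(lang)) {
        if (ch == 'i')    return 0x0130;    /* i -> İ */
        if (ch == 0x0131) return 'I';       /* ı -> I */
    } else if (ch == 0x0131) {
        return ch;                          /* ı maps to itself */
    }
    if (ch >= 'a' && ch <= 'z') return (sketch_wchar)(ch - ('a' - 'A'));
    return ch;                              /* includes İ, which is already uppercase */
}
```

Calling `sketch_lower('I', 1, "tr")` yields U+0131, while `sketch_lower('I', 0, "tr")` yields 'i', which mirrors the I <-> ı versus I <-> i split above.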
right
I'll take a look

Thanks
daniel
participants (3)
- Daniel Lehman
- Dmitry Timoshkov
- Henri Verbeet