On Dec 8, 2009, at 6:39 AM, Joerg-Cyril.Hoehle@t-systems.com <Joerg-Cyril.Hoehle@t-systems.com> wrote:
Ken Thomases wrote:
This results in the order I previously described:
  LC_ALL
  LC_* from the original environment
  Mac OS X settings
  LANG
In other words, LANG is completely ignored, since there presumably won't be a Mac without settings.
Correct. LANG is unreliable because Terminal.app can set it without the user's knowledge to a value that doesn't reflect the user's preferences.
I thought that was current behaviour.
Nope. Currently LANG, if set, supersedes the Mac OS X settings, except for LC_MESSAGES which is treated oddly.
The only point of dissension is about LANG. What you propose is closer to POSIX than what's current, so it's still progress, even though it both violates POSIX and deviates from Wine's behaviour on Linux.
It deviates from Linux because Linux doesn't have any mechanism (as far as I know) other than LC_ALL, LC_*, and LANG to express the user's preferences. Mac OS X does. I have argued (probably past the point of anybody caring) that it makes sense to consider the Mac OS X settings as though they were stored in the LC_* variables (with the user able to override them by manually setting LC_* or LC_ALL). If you grant this assumption, then the behavior does follow what Wine does on Linux. That is, if a Linux user has LC_* variables set to express their preferences, then those take precedence over LANG.
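To make the proposed order concrete, here is a minimal sketch of the precedence for a single locale category, written as a pure function. The name pick_locale and its parameters are hypothetical and not taken from Wine. It assumes that an empty value counts as unset (as POSIX specifies for the environment variables) and that "C" is the fallback when nothing is set.

```c
#include <stddef.h>

/* Hypothetical helper, not Wine code: return the first candidate that is
 * set and non-empty. The order is the one argued for above:
 * LC_ALL, then the LC_* variable from the original environment,
 * then the Mac OS X setting, then LANG. Falls back to "C" (an assumption). */
static const char *pick_locale(const char *lc_all, const char *lc_category,
                               const char *mac_setting, const char *lang)
{
    const char *candidates[] = { lc_all, lc_category, mac_setting, lang };
    size_t i;

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++)
        if (candidates[i] && candidates[i][0])
            return candidates[i];
    return "C";
}
```

With only the Mac OS X setting and LANG present, the Mac OS X setting is returned; LANG is consulted only when everything ahead of it is missing.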
Hmm, thinking about your pseudo-code (not quoted here), I'm not sure that implements the "LC_ALL takes priority over LC_xyz" correctly, does it?
My pseudocode only sets environment variables (if they aren't already set). The precedence is implemented by the C library in how it handles the various environment variables during setlocale(LC_ALL, ""). If LC_ALL is set in the environment, then it takes precedence over the LC_* and LANG variables, in which case the stuff we do to translate the Mac OS X settings to the LC_* variables is ignored (appropriately). Then, Wine takes its cues from the C library.
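For readers who did not see the pseudocode, the "set only if not already set" step can be sketched roughly as follows. This is an illustration, not the forthcoming patch: export_default and init_locale are hypothetical names, and the two categories shown are arbitrary examples. It relies on the POSIX setenv() call, whose third argument of 0 means an existing value is left alone.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: export a value derived from the Mac OS X settings,
 * but keep whatever the user already put in the environment.
 * Passing 0 as the third argument tells setenv() not to overwrite. */
static int export_default(const char *name, const char *value)
{
    return setenv(name, value, 0);
}

/* Hypothetical helper: fill in defaults for two example categories, then
 * let the C library apply its own precedence among LC_ALL, LC_* and LANG.
 * Returns NULL if the C library rejects the resulting locale. */
static const char *init_locale(const char *mac_default)
{
    export_default("LC_CTYPE", mac_default);
    export_default("LC_MESSAGES", mac_default);
    return setlocale(LC_ALL, "");
}
```

Because the helper never overwrites, a user who exports LC_CTYPE by hand keeps that value, and a user who exports LC_ALL overrides everything through the C library's normal handling.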
Also, it omits the mapping that Bruno Haible hinted at: "Note that these [MacOS X] settings are similar but not entirely equal to Unix (glibc) conventions (e.g. "zh-Hans" vs. "zh_CN"), therefore some mapping of names has to be done."
"zh-Hans" doesn't specify a country, so it would actually map to just "zh". (Which, by the way, the Mac C library rejects because it doesn't have a proper locale definition in /usr/share/locale. So, even if we performed the mapping, it wouldn't buy us anything in this case.)
In my testing, if you set your Mac OS X formats region to one of China, Taiwan, or Hong Kong, then CFLocaleGetIdentifier( CFLocaleCopyCurrent() ) produces a value that the C library would understand: zh_CN, zh_TW, or zh_HK. If you customize your formats, it may add stuff like @currency=USD to the end of that, which the C library chokes on, so we have to strip it off. (The current Wine code already does this.)
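As an illustration of the stripping step (this is a sketch, not the code currently in Wine), cutting the identifier at the first '@' is enough to drop a suffix like @currency=USD. The function name strip_locale_modifier is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: truncate a locale identifier in place at the
 * first '@', so "zh_CN@currency=USD" becomes "zh_CN". An identifier
 * without a modifier is left unchanged. Returns its argument. */
static char *strip_locale_modifier(char *locale_id)
{
    char *at = strchr(locale_id, '@');

    if (at)
        *at = '\0';
    return locale_id;
}
```

The buffer must be writable, so the identifier has to be copied out of the CFString into a char array first.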
Admittedly, you can probably get different results in a few ways. For example, I queried the available locale identifiers with CFLocaleCopyAvailableLocaleIdentifiers(), created a CFLocaleRef from each locale ID, and then probed those. That list contains some Chinese locales without a country, and CFLocaleGetIdentifier() on one of those CFLocaleRefs can produce results like "zh-Hans". I'm attaching the output of my locale dumping program.
It may be safest to not use CFLocaleGetIdentifier(), but rather format our own locale string by doing, effectively:
    CFLocaleRef locale = CFLocaleCopyCurrent();
    CFStringRef language = CFLocaleGetValue(locale, kCFLocaleLanguageCode);
    CFStringRef country = CFLocaleGetValue(locale, kCFLocaleCountryCode);
    CFStringRef localeString;

    /* %@ formats a CFStringRef; CFSTR() builds the constant format string. */
    if (country)
        localeString = CFStringCreateWithFormat(NULL, NULL, CFSTR("%@_%@.UTF-8"), language, country);
    else
        localeString = CFStringCreateWithFormat(NULL, NULL, CFSTR("%@.UTF-8"), language);

    /* language and country belong to locale, so release it only now. */
    CFRelease(locale);
This avoids the need to strip modifiers that the C library doesn't handle. It also takes care of specifying ".UTF-8" in a more straightforward manner than the current Wine code. Unfortunately, it throws away a lot of available information, but that's a consequence of using the C library as a middleman between Mac OS X and the Win32 world. In the long run, it would be better for Wine to directly use the Mac APIs rather than relying on the C library.
A patch to implement this approach (the one still relying on the C library) will be forthcoming.
-Ken