David Lee Lambert wrote:
On Wed, Jul 28, 2004 at 10:13:15PM -0700, Alexandre Julliard wrote:
"Dmitry Timoshkov" dmitry@baikal.ru writes:
I like the idea of moving that setting to the config file. We can't use existing unix locale settings except LC_ALL and LANG because every user's system might have (and does have) very different locale settings, we can't assume that everyone out there configures locale in the same way.
I don't see how the settings would be different, surely LC_CTYPE is always going to control the ASCII->Unicode mapping on Unix, so why shouldn't it do that on Wine? If the issue is that users change their setup without understanding the results, then surely adding even more config parameters that they need to get right is not going to improve the situation.
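To make the LC_CTYPE argument concrete, here is a minimal sketch of how a Unix program asks the C library which character set LC_CTYPE selects, and then uses it to turn bytes into Unicode. It is written in Python for brevity and is not Wine code; the function names `unix_ctype_codeset` and `ansi_to_unicode` are made up for this illustration.

```python
import locale


def unix_ctype_codeset() -> str:
    """Return the character set name selected by LC_CTYPE (e.g. 'UTF-8')."""
    try:
        # "" means: take the setting from the environment
        # (LC_ALL, then LC_CTYPE, then LANG).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unknown or uninstalled locale: stay in the current locale.
        pass
    return locale.nl_langinfo(locale.CODESET)


def ansi_to_unicode(data: bytes) -> str:
    """Decode bytes using the LC_CTYPE character set (hypothetical helper)."""
    try:
        return data.decode(unix_ctype_codeset(), errors="replace")
    except LookupError:
        # Python has no codec under that name; fall back to a
        # byte-preserving one so the sketch still returns something.
        return data.decode("latin-1")


if __name__ == "__main__":
    print(unix_ctype_codeset())
    print(ansi_to_unicode(b"Hello"))
```

The point of the sketch is only that a single, already existing setting answers the question; no extra configuration is consulted.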
Actually, there are a number of different locale-related things that Wine needs to keep track of:
- ANSI->Unicode translation for programs that use the ANSI calls, as has been
discussed in this thread.
Ok.
- Unicode->codepage translation on standard output, and codepage->Unicode
translation on standard input. Note that I could set LANG to 'en_US.UTF-16' on my Linux system, and programs SHOULD accept this. Most don't, however.
Why should this be set differently from (1)? In any case, I am not aware of UTF-16 being a compilable locale setting. Thus, it is not required that anyone support it.
- Unicode->codepage->Unicode translation on Linux kernels before 2.4, whereafter
filenames are SUPPOSED TO be in UTF-8, and kernel modules do translation for filesystems where filenames are stored in some other charset. (OPTIONAL, as filenames are not a big deal and the newer kernel fixes it; however, there has to be a conversion from the short-per-character format to UTF-8.)
Name one such filesystem, please. EXT and reiser never cared, as far as I know. VFAT has to translate names stored in UTF-16. Are you saying that kernels before 2.4 didn't have the "iocharset" option?
- Selection of appropriate language for strings in programs that use such
selection,
Discussed in this thread under the "GetDefaultUILanguage" API.
as well as time, numeric, and string formats.
Also discussed in this thread.
This is all through GetLocaleInfo(), whose first argument is an LCID returned by either GetUserDefaultLCID() or GetSystemDefaultLCID().
You can also pass "LOCALE_SYSTEM_DEFAULT" instead, but that doesn't matter. In any case, there are "user overrides" here, which we may, one day, want to implement. Everything is laid out in the table that started this thread.
- The MultiByteToWideChar() and WideCharToMultiByte() functions, which allow a
program to do its own conversion to and from Unicode with a specified codepage.
What do we need to do with these? They get an explicit codepage to convert to/from. Funny though it may sound, these functions are not affected by locale.
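A small sketch of why item (5) needs no locale handling: when the caller names the codepage, the same bytes decode the same way whatever LANG or LC_CTYPE say. Python's built-in `cpNNNN` codecs stand in for the Win32 functions here; `multibyte_to_wide` and `wide_to_multibyte` are illustrative names, not the real API signatures.

```python
def multibyte_to_wide(codepage: int, data: bytes) -> str:
    """Decode bytes with an explicitly named Windows codepage."""
    return data.decode(f"cp{codepage}")


def wide_to_multibyte(codepage: int, text: str) -> bytes:
    """Encode text with an explicitly named Windows codepage.

    Characters the codepage cannot represent become '?'.
    """
    return text.encode(f"cp{codepage}", errors="replace")


if __name__ == "__main__":
    raw = b"\xe9"
    # Same byte, two codepages, two different characters;
    # no environment variable is consulted in either call.
    print(multibyte_to_wide(1252, raw))  # Western European
    print(multibyte_to_wide(1251, raw))  # Cyrillic
```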
I think (1) should be specified on a per-program basis in the config file, with a system default there, and, as final default, raw translation for ANSI-to-Unicode and something reasonable the other way. I said in another message that codepages are deprecated; I meant that the ANSI calls (as opposed to (5)) are deprecated for internationalized applications.
I don't agree. Mixing default codepages across simultaneously running programs is not possible on Windows, and sounds horribly difficult to implement. Clipboard handling and sharing files between programs are two examples of things that are likely to go horribly wrong if we tried.
Having one setting applicable to all running processes sounds good enough. I don't object to a config setting overriding what LC_CTYPE says, but I don't see a use for it either.
The '.codepage' suffix of LANG and LC_CTYPE should both be searched for the answer to (2). As for graphical output to X, it doesn't seem like that should be restricted by setting LANG.
Again, why should it be different from (1)?
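For reference, the suffix lookup proposed for (2) could look roughly like the sketch below. It assumes the usual `la_CC.codeset@modifier` shape of locale names and searches every variable for a suffix, as the proposal says. Strict POSIX precedence would instead let a non-empty LC_ALL win outright, even without a suffix. The function names are made up for the example.

```python
from typing import Mapping, Optional


def codeset_suffix(value: str) -> Optional[str]:
    """Extract 'UTF-8' from 'en_US.UTF-8@euro'; None if there is no suffix."""
    if "." not in value:
        return None
    suffix = value.split(".", 1)[1]   # drop the la_CC part
    suffix = suffix.split("@", 1)[0]  # drop an optional @modifier
    return suffix or None


def console_codeset(env: Mapping[str, str]) -> Optional[str]:
    """Return the first codeset suffix found in LC_ALL, LC_CTYPE, LANG."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            found = codeset_suffix(value)
            if found:
                return found
    return None


if __name__ == "__main__":
    import os
    print(console_codeset(os.environ))
```

Passing the environment in as a mapping keeps the function easy to exercise with made-up settings, for example `console_codeset({"LANG": "ru_RU.KOI8-R"})`.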
For (3) there should be an option in the config file like "filesystem_codepage", but it should default to utf8.
We should probably not bother, though. This "problem" is shared by every other Unix program running on the system, and solved the same way there - they use LC_CTYPE.
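Whichever source ends up supplying the charset (a config option as proposed, or LC_CTYPE as suggested in the reply), the conversion for (3) is the same two-step mapping between 16-bit names and filesystem bytes. A minimal sketch, in which the `filesystem_codepage` parameter mirrors the hypothetical option named above and the function names are invented:

```python
def wide_name_to_unix(name_utf16le: bytes,
                      filesystem_codepage: str = "utf-8") -> bytes:
    """Convert a UTF-16LE file name to the bytes stored on the Unix side."""
    return name_utf16le.decode("utf-16-le").encode(filesystem_codepage)


def unix_name_to_wide(raw_name: bytes,
                      filesystem_codepage: str = "utf-8") -> bytes:
    """Convert Unix file name bytes back to UTF-16LE.

    Undecodable bytes are replaced rather than raising, since a directory
    listing should not fail on one oddly named file.
    """
    text = raw_name.decode(filesystem_codepage, errors="replace")
    return text.encode("utf-16-le")


if __name__ == "__main__":
    wide = "\u00e9.txt".encode("utf-16-le")
    print(wide_name_to_unix(wide))             # UTF-8 bytes
    print(wide_name_to_unix(wide, "latin-1"))  # single-byte charset
```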
For (4), Wine should select an appropriate LangID and LCID based on the la_CC tag and return them, respectively, in response to Get*DefaultLangID and Get*LCID. In Wine, at present, there is not really a separate 'system' level.
Furthermore, Wine could respond to different groups of GetLocaleInfo() constants according to LC_MESSAGES, LC_NUMERIC, etc., but this is an unusual feature that probably isn't needed at first. It seems that using the config file to define codepage translation and the suffix for IO charset translation gets rid of the typical user's need to have other variables besides LANG set.
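As a rough illustration of the la_CC lookup for (4), the sketch below packs a primary language and sublanguage into a LangID and then into an LCID, following the Win32 MAKELANGID/MAKELCID bit layout. The three table entries use the standard Windows identifiers for those locales. The table itself, and the names `LOCALE_TABLE` and `lcid_for`, are assumptions for the example and not Wine's actual data structures.

```python
# Primary language identifiers as defined in the Windows headers.
LANG_ARABIC = 0x01
LANG_ENGLISH = 0x09
LANG_SPANISH = 0x0A
SORT_DEFAULT = 0x0


def make_langid(primary: int, sub: int) -> int:
    """Pack a LANGID: sublanguage in the high 6 bits, primary in the low 10."""
    return (sub << 10) | primary


def make_lcid(langid: int, sort: int = SORT_DEFAULT) -> int:
    """Pack an LCID: sort order above the 16-bit LANGID."""
    return (sort << 16) | langid


# Hypothetical lookup table: la_CC tag -> (primary language, sublanguage).
LOCALE_TABLE = {
    "en_US": (LANG_ENGLISH, 0x01),
    "es_MX": (LANG_SPANISH, 0x02),
    "ar_SA": (LANG_ARABIC, 0x01),
}


def lcid_for(locale_name: str) -> int:
    """Map e.g. 'es_MX.UTF-8' to an LCID; raises KeyError if unknown."""
    la_cc = locale_name.split(".", 1)[0].split("@", 1)[0]
    primary, sub = LOCALE_TABLE[la_cc]
    return make_lcid(make_langid(primary, sub))


if __name__ == "__main__":
    for name in ("en_US.UTF-8", "es_MX", "ar_SA"):
        print(name, hex(lcid_for(name)))
```

The low 10 bits of each result are the primary language, so a caller that only needs the language can mask with `0x3FF`.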
Consider locales I might use:
LANG     LCID   LangID
en_US    1      9
es_MX    52     10
es_US    1      9
ar_SA    966    1
Let's say I have a program that prints "Hello, World" in the current language, using wide calls. When I run it in Linux, it should print that string out using the current language and codepage. Suppose I also have a database program that was written in outer burgoslavia and keeps its data files in the encoding for outer burgoslavian, which is supported only by Windows 95 for Burgoslavia and Windows Server 2003. I don't want to change Linux to support Burgoslavian, but if Burgoslavian is encoded in some Unicode font I can add a section to [AppDefaults] and let that particular program think it is running on an all-Burgoslavian system.
For (5), the functions act the same no matter what locale the user is in.