On 12/06/2012 8:59 PM, John Emmas wrote:
Thanks Hin-Tak and Dan but I think we're at cross purposes now. Remember that my original question had nothing to do with paths. I simply used paths as a convenient example. My question is about command-line parameters and (more specifically) about UTF-8 string conversion. Here's an example.... Consider a Windows user whose name is Göran. The UTF-8 byte sequence for this is:-
47 C3 B6 72 61 6E -- (6 bytes)
whereas Windows would expect something like this (depending on the user's locale):-
47 F6 72 61 6E -- (5 bytes)
Let's suppose that a Linux app launches a Windows child process (via Wine). The (Linux) host app needs to pass the string "Göran" as one of the command-line parameters. Linux uses UTF-8 and will therefore pass the first sequence of bytes to Wine (6 bytes). But Windows doesn't understand UTF-8. A Windows app would expect the second byte sequence (5 bytes - or 10 bytes for a Unicode app).
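For anyone who wants to check the byte counts quoted above, here is a minimal Python sketch. It assumes Windows-1252 as the ANSI code page (the actual code page depends on the user's locale, as noted) and UTF-16LE for the "Unicode app" case; the helper name hexdump is only for illustration.

```python
# "G\u00f6ran" is "Göran" written with an escape so the source encoding does not matter.
name = "G\u00f6ran"

def hexdump(data):
    """Return the bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

utf8 = name.encode("utf-8")      # what a UTF-8 Linux host passes
ansi = name.encode("cp1252")     # what an ANSI Windows app expects (assumed code page)
wide = name.encode("utf-16-le")  # what a Unicode Windows app expects

print(hexdump(utf8), f"-- ({len(utf8)} bytes)")  # 47 C3 B6 72 61 6E -- (6 bytes)
print(hexdump(ansi), f"-- ({len(ansi)} bytes)")  # 47 F6 72 61 6E -- (5 bytes)
print(hexdump(wide), f"-- ({len(wide)} bytes)")  # 47 00 F6 00 72 00 61 00 6E 00 -- (10 bytes)
```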
By "Linux uses UTF-8" you're saying that you have a UTF-8 locale active.
Does Wine carry out the necessary conversion or does it simply pass the original byte string unmodified? That's what I'm trying to find out. Thanks.
Wine first converts the command-line to Unicode according to the active host locale.
Then any character set conversion from Unicode to ANSI or vice versa within the application is done according to the active Windows locale.
If the Wine process needs to execute a native process, Wine converts the command line from Unicode back to the host character set, again according to the active host locale.
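The three steps above can be modelled roughly as follows. This is only a sketch of the behaviour described, not Wine's actual code: the function names are made up, the charset names are Python codec names, and errors="replace" stands in for whatever substitution the real conversion performs.

```python
def host_to_windows_ansi(raw, host_charset, windows_codepage):
    """Model: host bytes -> Unicode (host locale) -> ANSI bytes (Windows locale)."""
    text = raw.decode(host_charset, errors="replace")        # step 1: host locale
    return text.encode(windows_codepage, errors="replace")   # step 2: Windows locale

def windows_unicode_to_host(text, host_charset):
    """Model: Unicode command line -> host bytes when launching a native process."""
    return text.encode(host_charset, errors="replace")       # step 3: host locale

utf8_arg = b"G\xc3\xb6ran"  # "Göran" as passed by a UTF-8 host

# Host locale really is UTF-8: the ANSI app receives the expected 5 bytes.
good = host_to_windows_ansi(utf8_arg, "utf-8", "cp1252")
print(good)        # b'G\xf6ran'

# Host locale taken to be ISO-8859-1 while the bytes are UTF-8:
# each byte becomes its own character and the app sees 6 characters.
mismatched = host_to_windows_ansi(utf8_arg, "iso-8859-1", "cp1252")
print(mismatched)  # b'G\xc3\xb6ran'

# Going the other way, a Unicode command line is turned back into host bytes.
back = windows_unicode_to_host("G\u00f6ran", "utf-8")
print(back)        # b'G\xc3\xb6ran'
```

So the answer to the original question depends on the host locale: with a UTF-8 locale active the 6-byte sequence arrives as the 5-byte one, and without it the bytes effectively pass through unchanged.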
The default character set under Linux is ISO-8859-1, while the default character set under Windows is Windows-1252, which has all of the printable characters in the same places as in ISO-8859-1, with some extra printable characters in the C1 (0x80..0x9F) area.
If no host locale is specified (i.e. the locale is POSIX), and the default Windows locale is used, then the only characters that cannot be passed to an ANSI Windows program are the extra printable characters in the C1 area. All unrepresentable characters are replaced with question marks [?].
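To see which characters are affected, the following Python sketch lists the printable characters Windows-1252 places in 0x80..0x9F and checks that none of them can be expressed in ISO-8859-1. It relies on Python's own cp1252 and latin-1 codecs, which may differ in small details from the tables Wine uses, and uses errors="replace" to imitate the question-mark substitution.

```python
def cp1252_extras():
    """Map each byte in 0x80..0x9F to the character Windows-1252 assigns to it."""
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # a few positions are unassigned in Python's cp1252 table
    return extras

extras = cp1252_extras()

# Characters that turn into "?" when forced through ISO-8859-1.
lost = [ch for ch in extras.values()
        if ch.encode("latin-1", errors="replace") == b"?"]

# Everything in 0xA0..0xFF sits at the same position in both character sets.
same_upper_half = all(
    bytes([b]).decode("latin-1").encode("cp1252") == bytes([b])
    for b in range(0xA0, 0x100)
)

print(len(extras), len(lost), same_upper_half)
print("\u20ac".encode("latin-1", errors="replace"))  # euro sign -> b'?'
```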