The Chinese resource files are already utf-8, but I suspect lots of other files are in obscure character sets, which complicates patch processing and display.
Just how silly would it be for us to bite the bullet and set all source files to utf-8? We'd need to recode a bunch of files once, but after that, there'd be less confusion about how to view and edit various resource files.
Not saying we should up and do it, just throwing out an idea for discussion...
Dan Kegel wrote:
> The Chinese resource files are already utf-8,
Romanian was always utf-8. At a quick glance, Slovenian seems to be utf-8 only as well.
German, French and Spanish have a few files in utf-8 too.
> but I suspect lots of other files are in obscure character sets, which complicates patch processing and display.
> Just how silly would it be for us to bite the bullet and set all source files to utf-8? We'd need to recode a bunch of files once, but after that, there'd be less confusion about how to view and edit various resource files.
Preferably that should be done by a translator of the language. There might be some oddities that prevent a fully automatic recoding: e.g. the Romanian files use latin-2, which has only a "t with cedilla". That character exists in Unicode too, but Unicode also has the more correct "t with comma below", so a purely mechanical conversion would keep the less correct letter.
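For illustration, a recoding pass with that Romanian fix-up might look roughly like the sketch below. The function name and the mapping table are made up for this example, it only covers the "t" case mentioned above, and a translator should still review the result.

```python
# Hypothetical sketch: recode latin-2 bytes to utf-8 and swap the
# "t with cedilla" that latin-2 forces on us for "t with comma below".
# Only the t/T pair is handled here; other letters are left alone.
CEDILLA_TO_COMMA = str.maketrans({
    "\u0162": "\u021a",  # capital T with cedilla -> capital T with comma below
    "\u0163": "\u021b",  # small t with cedilla   -> small t with comma below
})


def recode_romanian(raw: bytes) -> bytes:
    """Decode latin-2 input, apply the fix-up, and return utf-8 bytes."""
    text = raw.decode("iso8859_2")
    return text.translate(CEDILLA_TO_COMMA).encode("utf-8")
```

Something like recode_romanian(open(path, "rb").read()) would then produce the bytes to write back. Whether the replacement is right for every string is exactly the kind of call a translator should make.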
> Not saying we should up and do it, just throwing out an idea for discussion...
We should encourage the translators to move to utf-8.
bye michael
On Tue, Sep 2, 2008 at 3:43 AM, Michael Stefaniuc <mstefani@redhat.com> wrote:
> Dan Kegel wrote:
> > [Let's convert the whole wine source tree to utf-8. Some of it already is, but I suspect lots of other files are in obscure character sets, which complicates patch processing and display. Then there'd be less confusion about how to view and edit various resource files.]
> Preferably that should be done by a translator of the language. There might be some oddities that prevent a full automatic recoding: e.g. Romanian uses latin-2 and that one has only a "t with cedilla" which exists in utf-8 too but utf-8 has the more correct "t with comma below accent". ... We should encourage the translators to move to utf-8.
Nogo just converted lots of Japanese resources, which is great. There are probably upwards of 500 files left to convert, though. At some point, after we've been asking translators to move to utf-8 for a while, we should probably do a mop-up pass and do automatic recoding. Translators will then pop out of the woodwork to fix any mistakes that causes :-)
- Dan
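To get a rough idea of what a mop-up pass would have to touch, something like the following sketch could list the files that do not decode as utf-8. The helper names are invented for this example. Note that pure-ASCII files count as valid utf-8, and that a file decoding cleanly does not prove it was meant to be utf-8.

```python
import os


def is_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8 (ASCII included)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_files(root: str):
    """Yield paths under root whose contents are not valid utf-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                if not is_utf8(f.read()):
                    yield path
```

Running list(non_utf8_files(".")) from the top of a checkout would give the candidate list. It says nothing about which legacy encoding each file uses, so that still has to come from the translator or from the file itself.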
On Aug 29, 2008, at 11:10 PM, Dan Kegel wrote:
> Just how silly would it be for us to bite the bullet and set all source files to utf-8? We'd need to recode a bunch of files once, but after that, there'd be less confusion about how to view and edit various resource files.
One area that I know will be a problem is in winex11.drv/keyboard.c. The keyboard tables there are expected to be in the encoding that will be in effect at runtime. That is, the bytes need to equal what will be returned by XkbTranslateKeySym and X[mb]LookupString.
So, that file is a mishmash of multiple encodings. Each table is in a different encoding, the one most likely to be used with that keyboard. I guess we could replace non-ASCII characters in the tables with escaped hex codes, although that makes it more difficult to edit (at least for folks using the encoding in question). We should probably comment each table with the encoding being used.
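As a sketch of the escaping idea, a small helper along these lines could turn the raw bytes of one table entry into an ASCII-only C string literal. The function name is hypothetical, and this is not how keyboard.c is generated today. One wrinkle is that C hex escapes are greedy: "\xe92" would be read as a single escape, so the helper splits the literal when a hex digit follows an escape.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def c_escape(raw: bytes) -> str:
    """Render raw bytes as an ASCII-only C string literal (hypothetical helper)."""
    out = ['"']
    prev_was_hex_escape = False
    for b in raw:
        # Escape everything outside printable ASCII as \xNN.
        if b >= 0x80 or b < 0x20 or b == 0x7F:
            out.append("\\x%02x" % b)
            prev_was_hex_escape = True
            continue
        # A hex digit right after \xNN would extend the escape, so close
        # the literal and reopen it; C concatenates adjacent literals.
        if prev_was_hex_escape and b in HEX_DIGITS:
            out.append('" "')
        ch = chr(b)
        out.append("\\" + ch if ch in '"\\' else ch)
        prev_was_hex_escape = False
    out.append('"')
    return "".join(out)
```

For a latin-1 table entry holding e-acute followed by 2, c_escape("\u00e92".encode("latin-1")) gives "\xe9" "2". The bytes the compiler sees stay the same, but the entry is no longer readable without a comment naming the encoding, which is the trade-off mentioned above.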
-Ken