On Tue 25/01/2005 at 20:03, Bill Medland wrote:
On January 25, 2005 03:48 pm, Robert Shearman wrote:
Vincent Béron wrote:
On Tue 25/01/2005 at 11:51, Robert Shearman wrote:
- Copyright 2001 Ove Kåven, TransGaming Technologies
Not sure we want UTF-8 in the source files... At least we don't have it (yet).
I think it should be made policy that anything in the comments should be UTF-8.
Can one specify that a region of a file has a certain encoding?
Not in a text file. I'm not even sure you can in an (X)HTML file (that is, for different regions of the same file).
Surely what we are saying is that the source files are UTF-8.
Are they? Only dlls/kernel/lcformat.c contains easily spotted UTF-8 sequences in a source file. The sequence (in Julio César Gázquez's name) comes from the old ole/ole2nls.c. Alexandre moved some functions to their current location and, while copying over the authors' names, converted them to UTF-8 (they were previously in Latin-1 or Latin-9, i.e. ISO-8859-1 or ISO-8859-15). I don't know whether that was intentional or a side effect of his editor (the change is dated May 22, 2003).
Should we keep it as is (and convert the other parts to UTF-8 at the same time), or should we change it back?
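To find out which files already carry UTF-8 sequences, one rough approach is to check whether a file's bytes are pure ASCII, decode cleanly as UTF-8, or neither. The following is a minimal sketch in Python, not a tool from the Wine tree; `classify_bytes` is a hypothetical helper name. It is only a heuristic: it cannot tell Latin-1 from Latin-9, and a single-byte-encoded file can occasionally happen to be valid UTF-8.

```python
import sys


def classify_bytes(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for the raw bytes of a text file.

    Hypothetical helper for illustration only.
    """
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Non-ASCII bytes that are not well-formed UTF-8, for example
        # text saved in a single-byte encoding such as ISO-8859-1.
        return "other"
    return "utf-8"


if __name__ == "__main__":
    # Usage: python classify.py file1.c file2.rc ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, classify_bytes(f.read()))
```

Run over the tree, files reported as "other" would be the ones needing conversion if UTF-8 were chosen, and files reported as "utf-8" the ones to convert back otherwise.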
Various .rc files also contain UTF-8 sequences, almost all of them in Nl.rc files, for e-umlaut and i-umlaut. I'm unable to make them display correctly when output by Wine (i.e., I see the individual UTF-8 bytes, not the intended character). Hans, did you provide your translations in UTF-8 by any chance? I think this should be corrected for that language (and for any others with the same issue), as .rc files are encoded in a language-specific code page, which is most likely not UTF-8.
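The symptom described above (individual bytes showing up instead of the accented letter) is what one would expect when UTF-8 text is read as a single-byte code page. Here is a minimal sketch, assuming the Dutch resources are interpreted as Windows-1252; that code page is an assumption here, not something confirmed in this thread, and `as_seen_in_codepage` is a hypothetical name.

```python
def as_seen_in_codepage(text: str, codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then misread those bytes in a single-byte code page."""
    return text.encode("utf-8").decode(codepage)


if __name__ == "__main__":
    # U+00EB is e with diaeresis, U+00EF is i with diaeresis.
    for ch in ("\u00eb", "\u00ef"):
        raw = ch.encode("utf-8")
        shown = as_seen_in_codepage(ch)
        # Print only ASCII-safe forms so the output does not depend on the terminal.
        print(raw.hex(), len(shown), ascii(shown))
```

Each umlauted letter becomes two bytes in UTF-8, so under this assumption it is displayed as two unrelated characters.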
I don't think the same can be done for code yet as gcc and other compilers probably don't support this?
Um! Which part of the code would be anything other than the pure ASCII subset of UTF-8? The only bits I can think of are comments and string literals, and on my system I can certainly put UTF-8 in them.
"Ascii" strings in UTF-8? Or Wide-char strings (or chars) in UTF-8?
The first looks like nonsense, and I don't think the second would work unless gcc understands UTF-8. That is, with LANG=*.UTF-8, is 'Ã¥' accepted by gcc as a single character representing the same thing as 'å', or does it produce "warning: multi-character character constant"? My current testing points to the warning.
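The byte counts behind that warning can be checked independently of any compiler. This is a minimal sketch in Python, not a description of how gcc tokenises its input; `bytes_between_quotes` and `fits_single_byte` are hypothetical names. It shows only how many bytes sit between the quotes for a given source encoding.

```python
def bytes_between_quotes(ch: str, encoding: str) -> bytes:
    """Return the raw bytes that a source file in `encoding` holds for one character."""
    return ch.encode(encoding)


def fits_single_byte(ch: str, encoding: str) -> bool:
    """True if the character occupies exactly one byte in the given source encoding."""
    return len(bytes_between_quotes(ch, encoding)) == 1


if __name__ == "__main__":
    a_ring = "\u00e5"  # a with ring above
    for enc in ("latin-1", "utf-8"):
        raw = bytes_between_quotes(a_ring, enc)
        print(enc, raw.hex(), fits_single_byte(a_ring, enc))
```

In ISO-8859-1 the character is the single byte E5, while in UTF-8 it is the two bytes C3 A5, which a compiler treating its input as single-byte text would see as two characters inside the quotes.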
For the comments, it's mainly an editor thing (some do support UTF-8, some don't). Of course, mail programs can also change the encoding while attaching or inlining patches...
Vincent