On Friday, 11 November 2016 3:07 AM, Akihiro Sagawa wrote:
On Wed, 9 Nov 2016 11:26:05 +0000, Hugh McMaster wrote:
I couldn't type much in Japanese. Most of the keys gave me alphanumeric characters. When using the full-width character set, each character (including alphanumeric) had a lot of space surrounding it. Half-width characters appeared normally.

Yes, half-width and full-width characters coexist in the Japanese character set.
Language question: can half-width and full-width characters appear together in the same sentence? I'm wondering how the console determines the width required. Does it use some kind of WINAPI call for the character set?
In computing, my answer is yes. Though this isn't a strict rule, the most common usage is half-width alphanumerics and full-width katakana, as seen in po/ja.po.

Do you know Unicode Standard Annex #11, East Asian Width [1]? According to the document, each Unicode character is classified into one of six categories: Ambiguous, Fullwidth, Halfwidth, Narrow, Wide, or Neutral (= Not East Asian). I'm not sure how the native console determines the width required, but it seems that Ambiguous, Fullwidth and Wide characters require two buffer cells in the Japanese locale on Windows 7.

Unfortunately, there are no APIs to get this East Asian Width classification directly. However, GetStringType (C3_HALFWIDTH or C3_FULLWIDTH) partially helps us (again).
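
To make the observation above concrete, here is a rough sketch in C of how the six classes could map to console cells. The enum, the function name cells_for_class() and the east_asian_locale flag are all made up for illustration; none of them is an existing API. The treatment of Ambiguous as two cells only in an East Asian context follows the behaviour described above and the general guidance in UAX #11, not anything confirmed about the native console.

```c
#include <assert.h>
#include <stdbool.h>

/* The six East Asian Width classes named in UAX #11. */
enum ea_width {
    EAW_NEUTRAL,
    EAW_AMBIGUOUS,
    EAW_HALFWIDTH,
    EAW_FULLWIDTH,
    EAW_NARROW,
    EAW_WIDE
};

/* Hypothetical helper: number of console cells one character occupies.
 * Assumption: Ambiguous characters take two cells only when the console
 * runs in an East Asian (e.g. Japanese) locale. */
static int cells_for_class(enum ea_width cls, bool east_asian_locale)
{
    assert((int)cls >= (int)EAW_NEUTRAL && (int)cls <= (int)EAW_WIDE);

    switch (cls) {
    case EAW_FULLWIDTH:
    case EAW_WIDE:
        return 2;
    case EAW_AMBIGUOUS:
        return east_asian_locale ? 2 : 1;
    default:
        /* Neutral, Halfwidth and Narrow all fit in one cell. */
        return 1;
    }
}
```

With this mapping, cells_for_class(EAW_AMBIGUOUS, true) gives 2 and cells_for_class(EAW_AMBIGUOUS, false) gives 1. The harder part, obtaining the class for a given character, is not covered here.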
Thanks for adding that link to the Unicode annex. It was really useful.
I've begun looking into functions to determine character width. wcwidth() and wcswidth() (not in ISO C) are worth looking at.
StackOverflow [1] has some useful information, including an old C-based implementation of wcwidth and friends. It's worth testing on mixed-width strings to see how useful it is. We may be able to update it to use the latest Unicode standard.
[1] http://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display...
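
As a starting point for the mixed-width testing mentioned above, here is a minimal sketch in C. It is not the implementation from the StackOverflow answer: the range table is deliberately tiny and incomplete (kana, CJK unified ideographs and the fullwidth forms only), and the names sketch_cp_cells() and sketch_string_cells() are hypothetical. A real version would need a table generated from the Unicode data files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cp_range { uint32_t first, last; };

/* Assumption: a very small subset of the ranges that UAX #11 lists as
 * Wide or Fullwidth. Everything else is treated as one cell. */
static const struct cp_range wide_ranges[] = {
    { 0x3040, 0x30FF },  /* Hiragana and Katakana */
    { 0x4E00, 0x9FFF },  /* CJK Unified Ideographs */
    { 0xFF01, 0xFF60 },  /* Fullwidth ASCII variants */
};

/* Cells needed for a single code point according to the table above. */
static int sketch_cp_cells(uint32_t cp)
{
    size_t i;

    for (i = 0; i < sizeof(wide_ranges) / sizeof(wide_ranges[0]); i++)
        if (cp >= wide_ranges[i].first && cp <= wide_ranges[i].last)
            return 2;
    return 1;
}

/* Total cells for a zero-terminated array of code points. */
static int sketch_string_cells(const uint32_t *cps)
{
    int total = 0;

    assert(cps != NULL);
    for (; *cps; cps++)
        total += sketch_cp_cells(*cps);
    return total;
}
```

For example, the code points for "Wine" followed by the four full-width katakana U+30AB, U+30BF, U+30AB, U+30CA come to 4 + 8 = 12 cells, while half-width katakana such as U+FF76 counts as one cell.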