http://bugs.winehq.org/show_bug.cgi?id=23272
--- Comment #10 from Sagawa sagawa.aki+winebugs@gmail.com 2010-06-28 08:11:31 --- (In reply to comment #9)
(In reply to comment #7)
Created an attachment (id=29164)
--> (http://bugs.winehq.org/attachment.cgi?id=29164) [details]
proposed patch for CJK word break rule 2
Why is using GetStringTypeW() not enough?
OK, there are three reasons.
1. Hangul Syllables
Hangul Syllables characters (i.e. U+AC00-U+D7A3) are classified as C3_ALPHA, not as C3_KATAKANA, C3_HIRAGANA or C3_IDEOGRAPH. Thai letters (e.g. U+0E01) get the same classification. As far as I know, a line may be broken inside a word for the former, but not for the latter. But GetStringTypeW() just returns C3_ALPHA for both of them. Therefore, we cannot distinguish between them with CT_CTYPE3 information alone.
2. CJK symbols
CJK code pages define some symbol characters that are in common use. For instance, the following characters are used in Japanese text:
- U+3005 (IDEOGRAPHIC ITERATION MARK)
- U+300F (RIGHT WHITE CORNER BRACKET)
- U+3231 (PARENTHESIZED IDEOGRAPH STOCK)
- U+339E (SQUARE KM)
But according to GetStringTypeW() they are not hiragana, katakana or ideographic characters. They are C3_NOTAPPLICABLE or C3_SYMBOL characters, so they have the same problem. Moreover, characters of other, non-CJK scripts also have the C3_NOTAPPLICABLE or C3_SYMBOL class, so the class alone does not identify them. In my opinion, because the characters above are part of CJK text, they should word-wrap like ideographic characters (not in the Western style).
3. Kinsoku shori
CJK text can be broken inside words. However, there are some exceptions, called Kinsoku shori (a Japanese publishing term). In short, some characters are not allowed at the start of a line, and some others are not allowed at the end of a line. For details, please refer to Wikipedia (http://en.wikipedia.org/wiki/Kinsoku_shori) or the W3C Working Group Note (http://www.w3.org/TR/2009/NOTE-jlreq-20090604/#en-subheading2_1_7). GetStringTypeW() does not identify these special characters, but they are needed to format CJK text properly. Thus, I added a table, independent of the API, for a basic implementation of Kinsoku shori.
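To make the three points concrete, here is a minimal sketch in C of the kind of checks they imply. It is not the code from the attached patch: every name in it (wchar16, is_hangul_syllable, is_cjk_symbol_block, can_break_between) is hypothetical, the Kinsoku tables hold only a few example characters chosen for illustration, and the block ranges are simply the Unicode blocks that contain the symbols listed in point 2. Ideographs and kana are left out on purpose, since those would still come from the C3_IDEOGRAPH, C3_HIRAGANA and C3_KATAKANA flags of GetStringTypeW().

```c
#include <stddef.h>

/* Hypothetical stand-in for the Windows WCHAR type (one UTF-16 code unit). */
typedef unsigned short wchar16;

/* Point 1: Hangul Syllables occupy U+AC00..U+D7A3. A range check separates
 * them from Thai letters, which share the C3_ALPHA class. */
int is_hangul_syllable(wchar16 ch)
{
    return ch >= 0xAC00 && ch <= 0xD7A3;
}

/* Point 2: Unicode blocks containing the symbols listed above
 * (assumption: treating the whole block as CJK-style is acceptable). */
int is_cjk_symbol_block(wchar16 ch)
{
    return (ch >= 0x3000 && ch <= 0x303F)   /* CJK Symbols and Punctuation */
        || (ch >= 0x3200 && ch <= 0x32FF)   /* Enclosed CJK Letters and Months */
        || (ch >= 0x3300 && ch <= 0x33FF);  /* CJK Compatibility */
}

/* Point 3: tiny example Kinsoku tables, far from complete. */
static const wchar16 no_line_start[] = {
    0x3001, /* IDEOGRAPHIC COMMA */
    0x3002, /* IDEOGRAPHIC FULL STOP */
    0x300D, /* RIGHT CORNER BRACKET */
    0x300F, /* RIGHT WHITE CORNER BRACKET */
    0xFF09  /* FULLWIDTH RIGHT PARENTHESIS */
};
static const wchar16 no_line_end[] = {
    0x300C, /* LEFT CORNER BRACKET */
    0x300E, /* LEFT WHITE CORNER BRACKET */
    0xFF08  /* FULLWIDTH LEFT PARENTHESIS */
};

/* Linear search; good enough for tables of this size. */
int in_table(const wchar16 *table, size_t count, wchar16 ch)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (table[i] == ch)
            return 1;
    return 0;
}

/* Returns nonzero if a line may break between prev and next.
 * Kinsoku exceptions are checked first, then the range-based classes. */
int can_break_between(wchar16 prev, wchar16 next)
{
    if (in_table(no_line_end, sizeof(no_line_end) / sizeof(no_line_end[0]), prev))
        return 0;   /* prev must not end a line */
    if (in_table(no_line_start, sizeof(no_line_start) / sizeof(no_line_start[0]), next))
        return 0;   /* next must not start a line */
    return is_hangul_syllable(prev) || is_hangul_syllable(next)
        || is_cjk_symbol_block(prev) || is_cjk_symbol_block(next);
}
```

With these definitions, can_break_between(0xAC00, 0xAC01) allows a break between two Hangul syllables, while can_break_between(0x0E01, 0x0E02) refuses one between two Thai letters, and a break before U+3002 is refused whatever precedes it.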