Not as nasty as finding out that the same thing happens when you use CString. MSDN used to say something along the lines of "CString fully supports MBCS", with a footnote stating that all the usual problems of working with MBCS strings still apply, and that you take care of them the same way.
I wound up writing a wrapper around the STL string class, keeping the same interface. I did this by using a basic_string<japanese_char>, where japanese_char was a class I wrote. It allocated two bytes of storage per character, and would consume two bytes of input if isleadbyte returned true, and one if it returned false. This allowed the class to provide a random access iterator into the string (as opposed to a forward iterator, which is all you can afford with the usual MBCS).
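For illustration only, here is a minimal sketch of the widening idea, not the original class. It assumes Shift-JIS (code page 932) and replaces the locale-dependent isleadbyte with a hard-coded range check so that it is self-contained. The names is_sjis_lead_byte, widen_mbcs and char_at are hypothetical, and a std::vector of 16-bit units stands in for basic_string<japanese_char>, which would also need its own char_traits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for isleadbyte(): the Shift-JIS lead-byte ranges.
// Assumption: the input is Shift-JIS; other MBCS code pages use other ranges.
inline bool is_sjis_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Widen an MBCS byte string so that every character occupies exactly one
// 16-bit unit: a lead byte is combined with the byte that follows it,
// any other byte is stored on its own.
inline std::vector<std::uint16_t> widen_mbcs(const std::string& bytes) {
    std::vector<std::uint16_t> out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (is_sjis_lead_byte(b) && i + 1 < bytes.size()) {
            const unsigned char trail = static_cast<unsigned char>(bytes[++i]);
            out.push_back(static_cast<std::uint16_t>((b << 8) | trail));
        } else {
            // Single-byte character (or a truncated lead byte at the end).
            out.push_back(b);
        }
    }
    return out;
}

// Random access: the i-th character, not the i-th byte.
inline std::uint16_t char_at(const std::vector<std::uint16_t>& chars,
                             std::size_t i) {
    assert(i < chars.size());
    return chars[i];
}
```

Once the string is widened, indexing is by character, which is what makes a random access iterator possible. The cost is two bytes of storage for every character, including the single-byte ones.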
The entire thing makes you appreciate UTF-8. Not only does it provide a bidirectional iterator, but if you are only parsing for ASCII characters, you can completely ignore the fact that it's a UTF-8 string, and parse it as if it were an ASCII string.
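To make the ASCII point concrete, here is a small sketch of splitting a UTF-8 string on an ASCII delimiter using plain byte-wise std::string operations. The function name split_on_ascii is made up for this example. It works because every byte of a multi-byte UTF-8 sequence has its high bit set, so a byte below 0x80 can never be part of another character. That is not true of Shift-JIS, where a trail byte can fall in the ASCII range (0x5C, the backslash, is the classic case).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a UTF-8 string on an ASCII delimiter, treating it as plain bytes.
// Precondition: delim is ASCII (below 0x80); that is the only case in which
// a byte-wise search is guaranteed not to land inside a multi-byte sequence.
inline std::vector<std::string> split_on_ascii(const std::string& utf8,
                                               char delim) {
    assert(static_cast<unsigned char>(delim) < 0x80);
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = utf8.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(utf8.substr(start));  // last field
            return fields;
        }
        fields.push_back(utf8.substr(start, pos - start));
        start = pos + 1;
    }
}
```

No decoding step is needed: the non-ASCII characters pass through untouched inside the fields.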
Shachar
Andriy Palamarchuk wrote:
Shachar, thank you for the detailed response.
MSDN says that the message returns the number of TCHARs. It looks like a two-byte MBCS character counts as 2 TCHARs. Nasty stuff :-(
Andriy