New subject: locales, unicode and ansi with msvcrt (bug 8022)

13 Apr 2007


      ...
...
...
What is your test app doing? It probably needs a test under Windows 
to see in which encoding (ANSI/OEM) a non-Unicode app should 
receive input via a pipe.
I meant things like 'dir >lst.txt', 'dir | sort > lst.txt'. 'dir' and 
'sort' could be replaced by some external .exes that take input and 
produce output.
Hiya,
I wrote an app which calls ReadConsoleW and then prints the hex value of the
first character read in. I used ALT+157 to supply a character which differs
between the codepages I was playing with.
(All of the following was under Windows XP.)
Code:
    ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), buf,
                 sizeof(buf)/sizeof(WCHAR), &nChars, NULL);
    printf("Character at position 0 is %x\n", buf[0]);
Results:
Active code page: 437 - Character at position 0 is a5
Active code page: 850 - Character at position 0 is d8
Active code page: 1252 - Character at position 0 is 9d
So I think it's converting between the console codepage and Unicode, if I
interpret that correctly.
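As a cross-check, the three results match what byte 0x9D (157 decimal) means in each of those code pages, assuming ALT+157 delivers that byte in the active console code page. The sketch below only hard-codes those three mappings; decode_byte_9d is a hypothetical helper for illustration, not anything in Wine or msvcrt, and on Windows the real conversion would go through MultiByteToWideChar.

```c
/* Hypothetical helper (illustration only): the Unicode value that byte 0x9D
 * decodes to under the three code pages tried above.
 * Returns 0 for any code page this sketch does not know about. */
unsigned int decode_byte_9d(unsigned int codepage)
{
    switch (codepage) {
    case 437:  return 0x00A5; /* YEN SIGN */
    case 850:  return 0x00D8; /* LATIN CAPITAL LETTER O WITH STROKE */
    case 1252: return 0x009D; /* unassigned in 1252, passed through unchanged */
    default:   return 0;
    }
}
```

Calling decode_byte_9d(437), decode_byte_9d(850) and decode_byte_9d(1252) gives the a5, d8 and 9d values seen in the results, which is consistent with ReadConsoleW decoding input using the console codepage.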
I then modified it to write out (**) Unicode character 0xa5 to see if the
conversion back is to OEM or ANSI, and although it's hard to prove beyond
doubt (*), it appears to me I am getting the reverse of that, i.e. it is
converted to the console codepage before being output.
(*) In cmd.exe, if it's not full screen, the font does not change when chcp is
executed, so for 437 and 850 I get an 0 type char and a yen. If I do it full
screen, both give me a yen, so I would conclude from that that the character
codepoint is changing and what is displayed depends on the font.
(**) Because I want to test this with WriteConsoleW, the output does not get
redirected to a file, so I can't see the raw codepoints...
Anything else I can test, or am I ok to put file tests into msvcrt test
buckets and allow the msvcrt unicode printf and friends to convert to
non-unicode using the console codepage before being output to the file
handle?
Suggested tests welcome, but I was planning on using the Unicode wide file
i/o functions, then opening the file and confirming the bytes were as expected
(if I stick to a-z, 0-9 we will know if it's come out with extra 0's).
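A rough sketch of the kind of file test I have in mind, using only standard C so it is not tied to msvcrt. The function names and the temporary file name are made up for illustration, and fputws stands in for whichever wide output function ends up being tested. Whether msvcrt really writes one byte per character for a text-mode stream is exactly what the test would be checking.

```c
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Writes `text` with the wide-character stdio API, then reopens the file in
 * binary mode and copies the raw bytes into `out`. The file is removed
 * afterwards. Returns the number of bytes read, or -1 on error. */
long wide_write_then_read_raw(const char *path, const wchar_t *text,
                              unsigned char *out, size_t out_size)
{
    FILE *f = fopen(path, "w");
    size_t n;

    if (!f) return -1;
    if (fputws(text, f) < 0) { fclose(f); return -1; }
    if (fclose(f) != 0) return -1;

    f = fopen(path, "rb");
    if (!f) return -1;
    n = fread(out, 1, out_size, f);
    fclose(f);
    remove(path);
    return (long)n;
}

/* Returns 1 if the raw bytes are exactly the narrow string `expected`,
 * i.e. one byte per character with no interleaved zero bytes. */
int bytes_match_narrow(const unsigned char *bytes, long count,
                       const char *expected)
{
    size_t len = strlen(expected);
    return count >= 0 && (size_t)count == len &&
           memcmp(bytes, expected, len) == 0;
}
```

For example, writing L"abc019" and then checking bytes_match_narrow(buf, n, "abc019") would fail if the output had come out as UTF-16 with extra 0 bytes, since the byte count would be 12 rather than 6.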
Regards and thanks for your time,
Jason

Re: locales, unicode and ansi with msvcrt (bug 8022)