Hi Jason,
While preparing tests I found that the 'user' backend of wineconsole works correctly with WriteConsole/WriteFile (see test1 and test2), so I used it as the reference.
I figured out the main issue with cmd: it passes strings to WriteFile in ANSI, but it should pass them in OEM.
I think the main locale issue is found: CMD and XCOPY both violate this rule, so it is not an MSVCRT bug. Moreover, [w]fprintf MUST NOT perform any AnsiToOem conversions (as test3 and test4 show).
I made and attached a quick hack to demonstrate that cmd was buggy. The attached screenshots show the difference. The patch heals almost everything, even localized filenames. I'm CC'ing this mail to wine-devel, since the console/cmd gurus must know much more ;-)
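
To make the ANSI/OEM mismatch concrete, here is a minimal sketch that models it with Python's standard codecs rather than with the Win32 calls themselves. It assumes a Russian Windows setup where the ANSI code page is 1251 and the OEM code page is 866 (the chcp output quoted in this thread reports 866; 1251 is an assumption). The helper name ansi_to_oem is hypothetical and only approximates what CharToOemA does.

```python
# Assumption: ANSI code page 1251 and OEM code page 866 (Russian locale).
ANSI_CP = "cp1251"
OEM_CP = "cp866"


def ansi_to_oem(data: bytes) -> bytes:
    """Re-encode ANSI bytes as OEM bytes (hypothetical stand-in for CharToOemA)."""
    return data.decode(ANSI_CP).encode(OEM_CP, errors="replace")


text = "Привет"
ansi_bytes = text.encode(ANSI_CP)

# What an OEM console displays when it is handed the ANSI bytes unchanged.
shown_wrong = ansi_bytes.decode(OEM_CP)

# What it displays when the bytes are converted to OEM first.
shown_right = ansi_to_oem(ansi_bytes).decode(OEM_CP)

print("unconverted:", shown_wrong)
print("converted:  ", shown_right)
```

The unconverted line comes out as unrelated glyphs, while the converted line reproduces the original text, which matches the before/after difference described for the cmd hack.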
I found some issues with wprintf, see README for test3 and test4. The attached tests are made with your patch applied.
As for running cmd without a console, I think it is not a good idea: we would have to check the locale and encoding, and then perform various conversions: to utf8, to cp1251, to koi8-r... Or should we just set the correct value of the console_[input|output]_codepage variables? Does the Windows console support UTF8 as an output codepage?
Given I don't know Russian, is there any chance of writing me a short program which uses _wfopen, fwprintf, fclose to create a file whose name and contents are Russian text, so I can test some of the cmd.exe issues reported with NLS chars?
Could I also have:
- Output from 'chcp' in a windows command prompt
c:>chcp "Текущая кодовая страница: 866" ["Current codepage: 866"]
- screen shot of 'dir' in a windows command prompt after running the program so I know what characters I am expecting
- Details of the LANG env var, and details of the xterm (I run Mandrake, and have Settings->Font and Settings->Encoding options which seem like the way to tailor what appears
LANG=ru_RU.KOI8-R xterm -fn koi10x20
Hopefully I can take the program and run it under wine, and see what filename it creates, then work out why DIR doesn't display it correctly. I am almost certain it's because the wine cmd.exe program is converting in the wrong codepage somehow. I want to do an exercise of unicoding wine's cmd once I finish work on the commands I have left to do (mostly copy, for and attrib)
I suspect your winegcc issues are down to the use of the L"unicode string" from my example, at a guess, but to be honest I don't know
I've already figured it out: I should pass the option "-mno-cygwin" to winegcc. It makes winegcc link against msvcrt.dll instead of glibc.
Thanks a lot for working on this bug! -- Kirill
Hiya,
Firstly, fantastic work, and it's also explained to me something Eric said which I didn't grasp...
While preparing tests I found that the 'user' backend of wineconsole works correctly with WriteConsole/WriteFile (see test1 and test2), so I used it as the reference. I figured out the main issue with cmd: it passes strings to WriteFile in ANSI, but it should pass them in OEM. I think the main locale issue is found: CMD and XCOPY both violate this rule, so it is not an MSVCRT bug. Moreover, [w]fprintf MUST NOT perform any AnsiToOem conversions (as test3 and test4 show).
I made and attached a quick hack to demonstrate that cmd was buggy. The attached screenshots show the difference. The patch heals almost everything, even localized filenames. I'm CC'ing this mail to wine-devel, since the console/cmd gurus must know much more ;-)
I found some issues with wprintf, see README for test3 and test4. The attached tests are made with your patch applied.
Another interesting url is here, which confirms what you are saying http://smallcode.weblogs.us/2006/10/25/code-page-for-win32-console-programs/
So, another round of discussion but things have moved forward a lot...
1. The underlying problem appears to be that the output from both programs (and any other command line program) should not go straight through msvcrt's i/o functions or WriteConsoleA/WriteFileA. It needs to be converted to OEM first and then printf/WriteFile/WriteConsole'd.
Ideally, a Unicode string would work better: it should be WriteConsoleW'd, and WriteFile'd if that fails (eg if output is redirected to a file). => The only problem I can see with this is that redirected output would then contain Unicode, which is wrong. However, advice I found on a URL on the web said this:
>>>>>>>>>>
Tips and considerations:
- use WriteConsole to output Unicode strings. Note that this API works
only on console handles and can not be used for a redirection to a disk file.
- If the output is being redirected to a disk file, use WriteFile with
the current console code page that can be retrieved by GetConsoleOutputCP (the console code page might be different from the currently selected OEM code page!).
>>>>>>>>>>
So I believe the output function in cmd.exe should end up like this (when fully unicoded):
    WriteConsoleW
    if this fails
        convert from wide to multibyte using the console output code page (GetConsoleOutputCP)
        WriteFile the result
    endif
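
The output path described above can be modelled with a short sketch. This is not Win32 code and not the cmd.exe implementation: console_write is a hypothetical stand-in for WriteConsoleW (returning False when the handle is not a console), the binary stream stands in for the WriteFile target, and the codec name stands in for the value GetConsoleOutputCP would return. All three names are assumptions made for illustration.

```python
import io


def output_text(text, console_write, redirected, console_output_cp):
    """Model of the proposed output path.

    console_write     -- hypothetical stand-in for WriteConsoleW; returns
                         False when the handle is not a real console.
    redirected        -- binary file-like object standing in for WriteFile.
    console_output_cp -- codec name standing in for GetConsoleOutputCP().
    """
    # First try the wide-character console write.
    if console_write(text):
        return "console"
    # Not a console (e.g. redirected to a file): convert wide to multibyte
    # in the console output code page and write the resulting bytes.
    redirected.write(text.encode(console_output_cp, errors="replace"))
    return "file"


# Simulate redirected output: the console write always fails.
sink = io.BytesIO()
path = output_text("Привет\n", lambda s: False, sink, "cp866")
print(path, sink.getvalue())
```

With this shape, a redirected file receives bytes in the console code page rather than raw Unicode, which addresses the concern about redirected output raised in point 1.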
Temporarily, since in cmd.exe we have an ANSI string in our hands, use something like the mechanism you have coded using CharToOem. Out of interest: since the string is ANSI, and MSDN says CharToOem(A) can convert in place, if you put CharToOem(message, message); just before the WriteFile calls (and remove the const qualifier), does this work?
2. The testcases I have previously confirm that msvcrt's functions are also misbehaving, and I strongly believe my current solution to those is correct. I.e. for applications using msvcrt's wprintf functionality, it needs to take into account the mode the file was opened in. (I don't know which other msvcrt routines are similarly affected, but if this is accepted I'll try to take a look) => Unless I get any negative comments soon I will tidy the tests up and submit that as a patch
3. The right solution is that cmd.exe works in Unicode, which is an exercise I plan to do as soon as I have finished work on the few remaining issues I want to address (I want to look at attrib, for and copy, plus a few bugs I have written on scraps of paper...)
4. xcopy needs a similar fix - If you are happy to do some tests (especially xcopying files with Russian names, plus copying directories created with Russian names) I'll contact you directly with a patch to test for me
5. Once all the above is done, I'd like to check on your test3/4 cases to see if there are any residual problems.
Again, thanks for your excellent work. I never thought I'd be so pleased to see Russian characters on a screen...!
Jason