"Dmitry Timoshkov" dmitry@baikal.ru wrote:
thomas.mertes@t-mobile.at wrote:
RtlUpperChar does not take the locale into account. It always converts just 'a' .. 'z' and nothing else.
Is this the right direction or did I miss something?
No, that's not right. We really need to convert to Unicode first, then upper-case the Unicode character and convert the result back to ANSI. See the ReactOS source for reference (at first glance it matches the NTDLL asm code closely enough).
I have not seen the asm code. I just ran tests on w2k (see my other mail). ReactOS is ok but not my measurement. These tests suggest that the right implementation would be:
CHAR WINAPI RtlUpperChar( CHAR ch )
{
    if (ch >= 'a' && ch <= 'z') {
        return ch - 'a' + 'A';
    } else {
        return ch;
    }
}
I was not able to find any other behaviour under w2k. RtlUpperString also just converts 'a' .. 'z' to 'A' .. 'Z'. In my opinion RtlUpperChar and RtlUpperString are low-level functions which do not know anything about locales / codepages. But the ntdll (and msvcrt) functions toupper and _toupper should know about locales / codepages (maybe they should use the solution below).
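To make the behaviour described above concrete, here is a minimal sketch of ASCII-only upper-casing applied to a single byte and to a counted buffer. The names ascii_upper_char and ascii_upper_buffer are hypothetical and are not the ntdll entry points; the real RtlUpperString works on STRING structures, which are left out here to keep the sketch self-contained.

```c
#include <stddef.h>

/* Hypothetical helper: upper-cases only ASCII 'a'..'z'. Every other byte,
 * including bytes >= 0x80, is returned unchanged, so no locale or
 * codepage is consulted. */
char ascii_upper_char(char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return (char)(ch - 'a' + 'A');
    return ch;
}

/* Hypothetical helper: copies len bytes from src to dst, upper-casing each
 * byte with ascii_upper_char. The length is explicit, so no terminating
 * NUL is read or written. */
void ascii_upper_buffer(char *dst, const char *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = ascii_upper_char(src[i]);
}
```

With this sketch, the five bytes 'a', 'z', 0xE4, '{' and '`' come out as 'A', 'Z', 0xE4, '{' and '`': only the two ASCII letters change, matching what the w2k tests reported.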
BTW: My other implementation did it the ReactOS way and is not a 'horrible kludge':
CHAR WINAPI RtlUpperChar( CHAR ch )
{
    WCHAR wch;

    wch = toupperW(((WCHAR) ch) & 0xff);
    if (wch >> 8) {
        return ch;
    } else {
        return (CHAR) wch;
    }
}
According to my information: when you do not have multibyte chars (as is the case here), you just set the upper byte to 0. Just look closely: ((WCHAR) ch) & 0xff is necessary because CHAR is signed and ((WCHAR) ch) would sign extend (which is wrong). I did a test with the 'horrible kludge' on wine and it worked fine (the ISO Latin 1 characters were converted correctly).
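The sign-extension point can be checked in isolation. The sketch below is only an illustration under stated assumptions: wchar16 is a hypothetical stand-in for the 16-bit WCHAR, and upper_latin1 is a simplified stand-in for toupperW that covers only U+0000..U+00FF. It is not Wine's actual case table.

```c
#include <stdint.h>

typedef uint16_t wchar16;   /* hypothetical stand-in for a 16-bit WCHAR */

/* Widening a signed char without a mask sign-extends bytes >= 0x80:
 * 0xE4 (-28) becomes 0xFFE4 instead of 0x00E4. */
wchar16 widen_unmasked(signed char ch)
{
    return (wchar16)ch;
}

/* Masking with 0xff clears the sign-extended upper byte. */
wchar16 widen_masked(signed char ch)
{
    return (wchar16)((wchar16)ch & 0xff);
}

/* Simplified stand-in for toupperW (an assumption, Latin-1 range only). */
wchar16 upper_latin1(wchar16 wch)
{
    if (wch >= 'a' && wch <= 'z')
        return (wchar16)(wch - 0x20);
    if (wch >= 0xe0 && wch <= 0xfe && wch != 0xf7)  /* 0xF7 is the division sign */
        return (wchar16)(wch - 0x20);
    if (wch == 0xff)
        return 0x0178;      /* upper case of U+00FF does not fit in one byte */
    if (wch == 0xb5)
        return 0x039c;      /* micro sign upper-cases to Greek capital MU */
    return wch;
}

/* Same shape as the second RtlUpperChar above: widen with the mask,
 * upper-case, and keep the input if the result needs more than 8 bits. */
signed char upper_char_via_wide(signed char ch)
{
    wchar16 wch = upper_latin1(widen_masked(ch));

    if (wch >> 8)
        return ch;
    return (signed char)wch;
}
```

Under these assumptions 0xE4 maps to 0xC4, while 0xFF is returned unchanged because its upper-case form lies outside the single-byte range. That is the case the (wch >> 8) test guards against.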
Greetings Thomas
thomas.mertes@t-mobile.at wrote:
RtlUpperChar does not take the locale into account. It always converts just 'a' .. 'z' and nothing else.
Is this the right direction or did I miss something?
No, that's not right. We really need to convert to Unicode first, then upper-case the Unicode character and convert the result back to ANSI. See the ReactOS source for reference (at first glance it matches the NTDLL asm code closely enough).
BTW, the MSDN page on this is http://msdn.microsoft.com/library/en-us/kmarch/hh/kmarch/k109_9x6a.asp
I have not seen the asm code. I just ran tests on w2k (see my other mail).
Maybe someday we'll run into a situation (Devanagari? :-) where we need to do the right Unicode thing. Until then, it seems reasonable to use the simplest possible code, with no calls to external toupper macros. - Dan