Steven Edwards wrote:
Documented here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/setupapi/se... http://msdn.microsoft.com/library/default.asp?url=/library/en-us/setupapi/se...
Changelog: Hervé Poussineau (hpoussin@reactos.com) Implement SetupGetInfFileListW and SetupGetInfInformationW Inf file parser now accept UNICODE files with FF FE header
Index: dlls/setupapi/parser.c
RCS file: /home/wine/wine/dlls/setupapi/parser.c,v retrieving revision 1.20 diff -u -r1.20 parser.c --- dlls/setupapi/parser.c 12 Sep 2005 22:07:53 -0000 1.20 +++ dlls/setupapi/parser.c 24 Sep 2005 06:52:23 -0000
...
@@ -951,12 +956,20 @@ WCHAR *new_buff = HeapAlloc( GetProcessHeap(), 0, size * sizeof(WCHAR) ); if (new_buff) {
DWORD len = MultiByteToWideChar( CP_ACP, 0, buffer, size, new_buff, size );
DWORD len = MultiByteToWideChar( CP_ACP, 0, buffer, size, new_buff,
}size * sizeof(WCHAR) ); err = parse_buffer( file, new_buff, new_buff + len, error_line ); HeapFree( GetProcessHeap(), 0, new_buff ); }
- else err = parse_buffer( file, buffer, (WCHAR *)((char *)buffer + size), error_line );
- else
- {
WCHAR *new_buff = (WCHAR *)buffer;
/* Some UNICODE files may start with the UNICODE marker */
if (*new_buff == 0xfeff)
new_buff++;
err = parse_buffer( file, new_buff, (WCHAR *)((char *)new_buff + size), error_line );
- }
Any time a Zero-Width Non-breaking Space is specifically skipped like this it is a bug. You should *never* have to do this if you use the right functions (isspaceW, etc). Is there some other bug lurking that made this change necessary?
Hi Rob,
On 9/24/05, Robert Shearman rob@codeweavers.com wrote:
- else err = parse_buffer( file, buffer, (WCHAR *)((char *)buffer + size), error_line );
- else
- {
WCHAR *new_buff = (WCHAR *)buffer;
/* Some UNICODE files may start with the UNICODE marker */
if (*new_buff == 0xfeff)
new_buff++;
err = parse_buffer( file, new_buff, (WCHAR *)((char *)new_buff + size), error_line );
- }
Any time a Zero-Width Non-breaking Space is specifically skipped like this it is a bug. You should *never* have to do this if you use the right functions (isspaceW, etc). Is there some other bug lurking that made this change necessary?
According to Hervé the *.inf files in the windows/system32/inf folder had some junk chars in the Unicode header or something like that and it causes the need for the skipping. He was worried that isspaceW might not handle the junk properly. There is a question about the sizeof WCHAR usage so I will check with him again and send along another patch later in the week. If Alexandre does not like the check I can also ask him to test it using isspaceW.
Thanks Steven
Steven Edwards wrote:
Hi Rob,
On 9/24/05, Robert Shearman rob@codeweavers.com wrote:
Any time a Zero-Width Non-breaking Space is specifically skipped like this it is a bug. You should *never* have to do this if you use the right functions (isspaceW, etc). Is there some other bug lurking that made this change necessary?
According to Hervé the *.inf files in the windows/system32/inf folder had some junk chars in the Unicode header or something like that and it causes the need for the skipping.
The 'junk' is a valid Unicode character that I describe above.
He was worried that isspaceW might not handle the junk properly.
#include <stdio.h> #include "wine/unicode.h"
int main(int argc, char *argv[]) { int space; space = isspaceW(0xfeff); printf("isspaceW(0xfeff) = %s\n", space ? "yes" : "no"); space = isspaceW('a'); printf("isspaceW('a') = %s\n", space ? "yes" : "no"); return 0; }
Output: isspaceW(0xfeff) = yes isspaceW('a') = no
There is a question about the sizeof WCHAR usage so I will check with him again and send along another patch later in the week.
Yes, that is another suspect change. If you send the patch again without those two changes it will be perfectly acceptable.
If Alexandre does not like the check I can also ask him to test it using isspaceW.
Sure. It is possible that the ReactOS implementation is isspaceW is broken for some reason, so I encourage Hervé to use the above simple test program to make sure it works correctly. There may also be some other bug, like not allowing whitespace between a newline separator and the start of a section header. I couldn't see a bug like this in the code, but it is possible.
Hi Rob,
This is the reply from Hervé.
1) Windows doesn't have a isspaceW method, so I tested the iswspace method, which should be equivalent.
Test program: #include <stdio.h> #include <ctype.h>
int main(int argc, char *argv[]) { int space; space = iswspace(0xfeff); printf("iswspace(0xfeff) = %s\n", space ? "yes" : "no"); space = iswspace(L'a'); printf("iswspace(L'a') = %s\n", space ? "yes" : "no"); space = iswspace(L' '); printf("iswspace(L' ') = %s\n", space ? "yes" : "no"); return 0; }
Output on Windows XP SP2: iswspace(0xfeff) = no iswspace(L'a') = no iswspace(L' ') = yes
Output on ReactOS r18023 (2005/09/24) iswspace(0xfeff) = no iswspace(L'a') = no iswspace(L' ') = yes
Output on Wine 20050830 iswspace(0xfeff) = yes iswspace(L'a') = no iswspace(L' ') = yes
According to these results, I'm sure that the Wine implementation of iswspace is broken for some reason. I encourage Rob to use the above simple test program to make sure it works properly in Wine.
2) 0xfeff/0xfeff at the start of an UNICODE file is a BOM (Byte Order Marking). It tells you in which byte order is written the file. See http://www.elfdata.com/plugin/unicodefaqdata.html#whatisbom You can notice that "... The BOM can only be written at the start of a text file, before any other bytes. ..."
See also http://www.elfdata.com/plugin/unicodefaqdata.html#Z about the Zero-Width Non-breaking Space you were mentionning
3) - else err = parse_buffer( file, buffer, (WCHAR *)((char *)buffer + size), error_line ); + else + { + WCHAR *new_buff = (WCHAR *)buffer; + /* Some UNICODE files may start with the UNICODE marker */ + if (*new_buff == 0xfeff) + new_buff++; + err = parse_buffer( file, new_buff, (WCHAR *)((char *)new_buff + size), error_line ); + }
My patch skips only the BOM (when it exists). The patch is not related to ZWNBSP, so I don't want to use iswspace. I also don't intend to change the whole parsing of .inf file, as it already seems correct to me. If you prefer, I may change the patch to skip this 0xfeff header in parse_buffer (setupapi/parser.c), but I need to add a new state like file_start_state (FILE_START). However, my proposition seems ways lighter.
4) To finish this mail, you're right when you say that the following change is incorrect: WCHAR *new_buff = HeapAlloc( GetProcessHeap(), 0, size * sizeof(WCHAR) ); if (new_buff) { - DWORD len = MultiByteToWideChar( CP_ACP, 0, buffer, size, new_buff, size ); + DWORD len = MultiByteToWideChar( CP_ACP, 0, buffer, size, new_buff, + size * sizeof(WCHAR) ); err = parse_buffer( file, new_buff, new_buff + len, error_line ); HeapFree( GetProcessHeap(), 0, new_buff ); } I shouldn't have done this change
Regards,
Hervé
Steven Edwards wrote:
Hi Rob,
This is the reply from Hervé.
- Windows doesn't have a isspaceW method, so I tested the iswspace
method, which should be equivalent.
Test program: #include <stdio.h> #include <ctype.h>
int main(int argc, char *argv[]) { int space; space = iswspace(0xfeff); printf("iswspace(0xfeff) = %s\n", space ? "yes" : "no"); space = iswspace(L'a'); printf("iswspace(L'a') = %s\n", space ? "yes" : "no"); space = iswspace(L' '); printf("iswspace(L' ') = %s\n", space ? "yes" : "no"); return 0; }
Output on Windows XP SP2: iswspace(0xfeff) = no iswspace(L'a') = no iswspace(L' ') = yes
Output on ReactOS r18023 (2005/09/24) iswspace(0xfeff) = no iswspace(L'a') = no iswspace(L' ') = yes
Output on Wine 20050830 iswspace(0xfeff) = yes iswspace(L'a') = no iswspace(L' ') = yes
According to these results, I'm sure that the Wine implementation of iswspace is broken for some reason. I encourage Rob to use the above simple test program to make sure it works properly in Wine.
I have sent a patch to fix this. It is caused by an incorrect wctype table in Wine. Can you send the changes in this patch again (preferably with my original recommendation to split the patch into the two logically separate changes)?
Thanks,