There are a group of functions in the kernel that appear to be broken in that they don't follow the Windows API specification. The general convention for these functions is as follows:
The function is called with a pointer to a buffer and the size of the buffer, and returns the number of characters copied to the buffer, excluding the null terminator, unless the buffer is too small when the total size in characters of the buffer needed (with a null terminator) is returned. If the buffer is too small, the string is not copied at all.
Several functions that I have spotted don't implement this correctly - particularly ones in files/directory.c. eg GetSystemDirectory. They count the Null Terminator. A typical symptom of this is that an installation program attempts to call LoadLibrary on "C:\WINDOWS\SYSTEM", because it's appended eg "OLE32.DLL" after the null character. This is one of the problems when installing the Microsoft installer, for example.
While looking for other functions that might do the same thing, I came across this: /*********************************************************************** * GetCurrentDirectoryW (KERNEL32.@) */ UINT WINAPI GetCurrentDirectoryW( UINT buflen, LPWSTR buf ) { LPSTR xpath = HeapAlloc( GetProcessHeap(), 0, buflen+1 ); UINT ret = GetCurrentDirectoryA( buflen, xpath ); if (ret < buflen) ret = MultiByteToWideChar( CP_ACP, 0, xpath, -1, buf, buflen ) - 1; HeapFree( GetProcessHeap(), 0, xpath ); return ret; }
But HeapAlloc takes a number of bytes, whereas buflen is in (two-byte) characters. ie a small buffer would appear to crash this code. (right?)
While I have succesfully patched these bugs on my machine, I'm not sure that going through and looking for every function that may be broken in some similar way and changing them one-by-one is a good way to proceed. These bugs all have to do with returning strings - it would probably be better to attempt to have functions that 'copy out an ANSI string' or 'copy out a UNICODE string' according to this API. Without meaning to be critical, the problem appears to be that the same functionality has been written many times throughout the code, and so multiple and different bugs have crept in.
However, I'm quite new to using wine, and so would like the advice and opinions of those more experienced. My questions: 1) Is there a design reason why a single common function for string copying is not used? Maybe something to do with separate libraries, some of which may be replaced by native windows DLLs? Maybe someone more knowledgeable than myself could direct me as to the best place to put the helper function implementations. 2) Would writing these string helper functions and changing (most) string-returning functions to use them be worthwhile (or indeed even welcome)? 3) Is there a move to using UNICODE internally in the kernel that means that this would be best combined with some other effort? Certainly if this were an eventual design goal then using helper functions would make that job a little easier.
Comments?
Justin SB
"Justin" == Justin Santa Barbara justinsb@hotmail.com writes:
Justin> There are a group of functions in the kernel that appear to be Justin> broken in that they don't follow the Windows API specification. Justin> The general convention for these functions is as follows: The Justin> function is called with a pointer to a buffer and the size of Justin> the buffer, and returns the number of characters copied to the Justin> buffer, excluding the null terminator, unless the buffer is too Justin> small when the total size in characters of the buffer needed Justin> (with a null terminator) is returned. If the buffer is too Justin> small, the string is not copied at all.
Justin> Several functions that I have spotted don't implement this Justin> correctly - particularly ones in files/directory.c. eg Justin> GetSystemDirectory. They count the Null Terminator.
GetSystemDirectoryA returns strlen( DIR_System.short_name ) This doesn't count the Null Terminator to my knowledge.
Justin> A typical Justin> symptom of this is that an installation program attempts to call Justin> LoadLibrary on "C:\WINDOWS\SYSTEM", because it's appended eg Justin> "OLE32.DLL" after the null character. This is one of the Justin> problems when installing the Microsoft installer, for example.
As said above, I don't see that problem. Maybe the implementation is not totally conforming in that it copies characters to the available buffer even if the whole name does not fits into the buffer, but the count seems right.
Justin> While looking for other functions that might do the same thing, Justin> I came across this: Justin> /*********************************************************************** Justin> * GetCurrentDirectoryW (KERNEL32.@) */ UINT WINAPI Justin> GetCurrentDirectoryW( UINT buflen, LPWSTR buf ) { LPSTR xpath = Justin> HeapAlloc( GetProcessHeap(), 0, buflen+1 ); UINT ret = Justin> GetCurrentDirectoryA( buflen, xpath ); if (ret < buflen) ret = Justin> MultiByteToWideChar( CP_ACP, 0, xpath, -1, buf, buflen ) - 1; Justin> HeapFree( GetProcessHeap(), 0, xpath ); return ret; }
Justin> But HeapAlloc takes a number of bytes, whereas buflen is in Justin> (two-byte) characters. ie a small buffer would appear to crash Justin> this code. (right?)
No, as long as GetCurrentDirectoryA behaves according to the API (wich it seemingly does), the right value is returned.
I think it is usefull to write small test programs, challenging an (file) API to print out the returned strings and compile it with e.g. mingw and let it run on Linux and Windows(perhaps in VMWare).
Justin> While I have succesfully patched these bugs on my machine, I'm Justin> not sure that going through and looking for every function that Justin> may be broken in some similar way and changing them one-by-one Justin> is a good way to proceed. These bugs all have to do with Justin> returning strings - it would probably be better to attempt to Justin> have functions that 'copy out an ANSI string' or 'copy out a Justin> UNICODE string' according to this API. Without meaning to be Justin> critical, the problem appears to be that the same functionality Justin> has been written many times throughout the code, and so multiple Justin> and different bugs have crept in.
A lot of work has been done to get things right in wine/files/.... So probably a some bugs remained and some functionality is duplicated.
Justin> However, I'm quite new to using wine, and so would like the Justin> advice and opinions of those more experienced. My questions: 1) Justin> Is there a design reason why a single common function for string Justin> copying is not used? Maybe something to do with separate Justin> libraries, some of which may be replaced by native windows DLLs? Justin> Maybe someone more knowledgeable than myself could direct me as Justin> to the best place to put the helper function implementations.
Wine code grows since about more then 7 years with over 150 known authors, and until noone does a big cleanup there is no clear design. And often the Win API has no clear design, so a lot of bells and whistles have to be observed, probably best done by (nearly) duplivating code. Also you must observe DLL separation and restrict yourself to common C-Lib functions. If you need some helper functions in the files handling, it best resided in ../wine/files.
Justin> 2) Would writing these string helper functions and changing Justin> (most) string-returning functions to use them be worthwhile (or Justin> indeed even welcome)?
If done right and tested, it seems a good thing. However as said above, it the functions you mention I don't see the error you observe.
Justin> 3) Is there a move to using UNICODE Justin> internally in the kernel that means that this would be best Justin> combined with some other effort? Certainly if this were an Justin> eventual design goal then using helper functions would make that Justin> job a little easier.
It is aggreed, that Wine should implement the WINNT way of WIN32. However the Win31/32s/95/98/... side should not be neglected. Historically Win31 and its decendants where normaly implemented first.
So for the wine/files/... functions, the ...W functions should be implemented genuine and the ..A functions should chunk down to that code.
Justin> Comments?
For more discussions on wine/files/... search the archives. There have been some definite words from Alexandre too
Justin> Justin SB <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Please no HTML in mail, and set the linebreaks.
Bye
Thanks for the quick reply Uwe. You're absolutely correct in that what I stated in my email wasn't right. I'm pretty sure that there is a bug - I'll try to describe it correctly this time! And I'll put bug in quotes every time, just in case I'm still wrong...
The "bug" that I originally spotted was in GetSystemDirectoryW. /*********************************************************************** * GetSystemDirectoryW (KERNEL32.@) */ UINT WINAPI GetSystemDirectoryW( LPWSTR path, UINT count ) { UINT len = MultiByteToWideChar( CP_ACP, 0, DIR_System.short_name, -1, NULL, 0 );
if (path && count) { if (!MultiByteToWideChar( CP_ACP, 0, DIR_System.short_name, -1, path, count )) path[count-1] = 0; } return len;
}
The problem is that MultiByteToWideChar does count the null terminator, if param 4 (the multibyte string length) is -1 indicating a null-terminated string.
Hence this code needs to be changed to return instead: if (len <= count) return len - 1; else return len;
Then I had a look at GetSystemDirectoryA:
UINT WINAPI GetSystemDirectoryA( LPSTR path, UINT count ) { if (path) lstrcpynA( path, DIR_System.short_name, count ); return strlen( DIR_System.short_name ); }
You're right that strlen doesn't count the terminating null, but the problem now is that the terminating null does need to be counted if the buffer is too small (it should return the size of the buffer required to hold the path). Sorry I didn't explain that correctly. So I think we need: len = strlen() if (len < count) return len; else return len+1;
So the "bug" on GetSystemDirectoryW is that it always counts the null-terminator, and on GetSystemDirectoryA that it never counts it. It should be counted iff the buffer is too small.
This was stopping the installer working when I had set winver=nt2k, so I was using the (more "buggy") W rather than the less serious A.
Finally, with GetCurrentDirectoryW, I was indeed wrong about the chance of a buffer overflow. But does it return the right value? I think if buflen = 0 then GetCurrentDirectoryW returns the same value as in GetCurrentDirectoryA, but if the ANSI string has a multi-byte symbol I should get a larger value from GetCurrentDirectoryA vs GetCurrentDirectoryW (as more bytes than characters).
As I see it, the advantage of a helper function is that for about the same amount of work of fixing GetSystemDirectory and GetCurrentDirectory, all the functions should work the same way meaning any bugs would quickly be found. It might also be possible to optimize the case where buflen=0, while this isn't practical per-function. Should I offer a patch only for wine/files/* ?
Justin SB
----- Original Message ----- From: "Uwe Bonnes" bon@elektron.ikp.physik.tu-darmstadt.de To: "Justin Santa Barbara" justinsb@hotmail.com Cc: wine-devel@winehq.com Sent: Monday, September 03, 2001 9:44 PM Subject: Re: Bugs involving returning strings from API functions
"Justin" == Justin Santa Barbara justinsb@hotmail.com writes:
Justin> There are a group of functions in the kernel that appear to be Justin> broken in that they don't follow the Windows API
specification.
Justin> The general convention for these functions is as follows: The Justin> function is called with a pointer to a buffer and the size of Justin> the buffer, and returns the number of characters copied to the Justin> buffer, excluding the null terminator, unless the buffer is
too
Justin> small when the total size in characters of the buffer needed Justin> (with a null terminator) is returned. If the buffer is too Justin> small, the string is not copied at all. Justin> Several functions that I have spotted don't implement this Justin> correctly - particularly ones in files/directory.c. eg Justin> GetSystemDirectory. They count the Null Terminator.
GetSystemDirectoryA returns strlen( DIR_System.short_name ) This doesn't count the Null Terminator to my knowledge.
Justin> A typical Justin> symptom of this is that an installation program attempts to
call
Justin> LoadLibrary on "C:\\WINDOWS\\SYSTEM", because it's appended eg Justin> "OLE32.DLL" after the null character. This is one of the Justin> problems when installing the Microsoft installer, for example.
As said above, I don't see that problem. Maybe the implementation is not totally conforming in that it copies characters to the available buffer
even
if the whole name does not fits into the buffer, but the count seems
right.
Justin> While looking for other functions that might do the same
thing,
Justin> I came across this: Justin>
/***********************************************************************
Justin> * GetCurrentDirectoryW (KERNEL32.@) */ UINT WINAPI Justin> GetCurrentDirectoryW( UINT buflen, LPWSTR buf ) { LPSTR xpath
=
Justin> HeapAlloc( GetProcessHeap(), 0, buflen+1 ); UINT ret = Justin> GetCurrentDirectoryA( buflen, xpath ); if (ret < buflen) ret = Justin> MultiByteToWideChar( CP_ACP, 0, xpath, -1, buf, buflen ) - 1; Justin> HeapFree( GetProcessHeap(), 0, xpath ); return ret; } Justin> But HeapAlloc takes a number of bytes, whereas buflen is in Justin> (two-byte) characters. ie a small buffer would appear to
crash
Justin> this code. (right?)
No, as long as GetCurrentDirectoryA behaves according to the API (wich it seemingly does), the right value is returned.
I think it is usefull to write small test programs, challenging an (file) API to print out the returned strings and compile it with e.g. mingw and
let
it run on Linux and Windows(perhaps in VMWare).
Justin> While I have succesfully patched these bugs on my machine, I'm Justin> not sure that going through and looking for every function
that
Justin> may be broken in some similar way and changing them one-by-one Justin> is a good way to proceed. These bugs all have to do with Justin> returning strings - it would probably be better to attempt to Justin> have functions that 'copy out an ANSI string' or 'copy out a Justin> UNICODE string' according to this API. Without meaning to be Justin> critical, the problem appears to be that the same
functionality
Justin> has been written many times throughout the code, and so
multiple
Justin> and different bugs have crept in.
A lot of work has been done to get things right in wine/files/.... So probably a some bugs remained and some functionality is duplicated.
Justin> However, I'm quite new to using wine, and so would like the Justin> advice and opinions of those more experienced. My questions:
1)
Justin> Is there a design reason why a single common function for
string
Justin> copying is not used? Maybe something to do with separate Justin> libraries, some of which may be replaced by native windows
DLLs?
Justin> Maybe someone more knowledgeable than myself could direct me
as
Justin> to the best place to put the helper function implementations.
Wine code grows since about more then 7 years with over 150 known authors, and until noone does a big cleanup there is no clear design. And often the Win API has no clear design, so a lot of bells and whistles have to be observed, probably best done by (nearly) duplivating code. Also you must observe DLL separation and restrict yourself to common C-Lib functions. If you need some helper functions in the files handling, it best resided in
../wine/files.
Justin> 2) Would writing these string helper functions and changing Justin> (most) string-returning functions to use them be worthwhile
(or
Justin> indeed even welcome)?
If done right and tested, it seems a good thing. However as said above, it the functions you mention I don't see the error you observe.
Justin> 3) Is there a move to using UNICODE Justin> internally in the kernel that means that this would be best Justin> combined with some other effort? Certainly if this were an Justin> eventual design goal then using helper functions would make
that
Justin> job a little easier.
It is aggreed, that Wine should implement the WINNT way of WIN32. However the Win31/32s/95/98/... side should not be neglected. Historically Win31
and
its decendants where normaly implemented first.
So for the wine/files/... functions, the ...W functions should be implemented genuine and the ..A functions should chunk down to that code.
Justin> Comments?
For more discussions on wine/files/... search the archives. There have
been
some definite words from Alexandre too
Justin> Justin SB <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Please no HTML in mail, and set the linebreaks.
Bye
Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de
Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt --------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------
"Justin" == Justin Santa Barbara justinsb@hotmail.com writes:
Justin> Thanks for the quick reply Uwe. You're absolutely correct in Justin> that what I stated in my email wasn't right. I'm pretty sure ...
Justin> The problem is that MultiByteToWideChar does count the null Justin> terminator, if param 4 (the multibyte string length) is -1 Justin> indicating a null-terminated string.
I didn't consider MultiByteToWideChar.
Justin> Hence this code needs to be changed to return instead: if (len Justin> <= count) return len - 1; else return len;
Justin> Then I had a look at GetSystemDirectoryA:
Justin> UINT WINAPI GetSystemDirectoryA( LPSTR path, UINT count ) { if Justin> (path) lstrcpynA( path, DIR_System.short_name, count ); return Justin> strlen( DIR_System.short_name ); }
You're right too.
Justin> You're right that strlen doesn't count the terminating null, but Justin> the problem now is that the terminating null does need to be Justin> counted if the buffer is too small (it should return the size of Justin> the buffer required to hold the path). Sorry I didn't explain Justin> that correctly. So I think we need: len = strlen() if (len < Justin> count) return len; else return len+1;
Justin> ...
Justin> As I see it, the advantage of a helper function is that for Justin> about the same amount of work of fixing GetSystemDirectory and Justin> GetCurrentDirectory, all the functions should work the same way Justin> meaning any bugs would quickly be found. It might also be Justin> possible to optimize the case where buflen=0, while this isn't Justin> practical per-function. Should I offer a patch only for Justin> wine/files/* ?
I think that everybody feared to write a general helper function and break a lot ofr special cases. But if done right and tested well, its a good thing.
Write it first for wine/files/*, if it proved usefull, it will find more uses.
>> >>>>> "Justin" == Justin Santa Barbara justinsb@hotmail.com writes: >> Justin> There are a group of functions in the kernel that appear to be
Another hint: It is well known habit to cite first, editing away unneeded parts and then to put one own text. If the whole message gets attached unattended, it will blew up the text length.
Bye