References:
http://www.winehq.org/hypermail/wine-patches/2005/08/0265.html
http://www.winehq.org/hypermail/wine-devel/2005/08/0286.html
Background:
Work is currently proceeding on a branched version to create additional APIs for WINE that use UNIX path names rather than Windows ones. This is useful for Winelib apps and seeks to make them look more like they are native apps, thereby addressing some of the complaints that Winelib apps are somehow of lesser status than ports using other APIs. After some discussion, it was decided that this would remain a branch until at least such time as the implementation was proven to work in real-world situations.
Part of this involves producing file dialog APIs that operate appropriately in this context. That includes taking UNIX path names on input, producing UNIX path names on output and in callbacks to the application, and browsing a hierarchy that does not include Windows-isms such as drive letters.
Making the modifications directly in the existing code would produce a set of differences that would cause significant headaches for branch maintenance and for any future merge-back to WineHQ. The objective is to keep the differences small so as to improve compatibility between the branches.
The patch referenced above took a minimal-change approach to this problem: it defined an interface that mostly mirrored the small operations performed by the existing code, without even putting in local stubs (hence the inconsistent calling conventions in the interface).
The general principle I have used is that path names should, as far as possible, be opaque so that the file dialog code itself never examines their contents directly, but rather calls functions in the interface to extract or locate particular portions of the path, to modify or concatenate paths or to make use of the paths.
The minimalist interface:
typedef struct {
    UINT code_page;
    WCHAR sep_char;

    HRESULT WINAPI (*get_top_folder) (IShellFolder **);
    LPITEMIDLIST (*get_pidl_from_name)(IShellFolder *, LPWSTR);
    BOOL (*get_display_name) (LPCITEMIDLIST, LPWSTR);
    IShellFolder* (*get_folder_from_pidl)(LPITEMIDLIST);

    BOOL WINAPI (*change_directory) (LPCWSTR);
    UINT WINAPI (*get_directory) (UINT, LPWSTR);

    void (*qualify_path) (LPWSTR, LPCWSTR);
    void (*complete_path) (LPWSTR);
    BOOL (*has_invalid_char) (LPCWSTR);
    LPWSTR WINAPI (*find_next_component) (LPCWSTR);
    LPWSTR WINAPI (*find_file_name) (LPCWSTR);
    LPWSTR WINAPI (*find_extension) (LPCWSTR);
    BOOL WINAPI (*file_exists) (LPCWSTR);
    BOOL WINAPI (*is_directory) (LPCWSTR);
    LPWSTR WINAPI (*add_dir_sep) (LPWSTR);
    DWORD WINAPI (*get_full_path) (LPCWSTR, DWORD, LPWSTR, LPWSTR*);
} FileDlgFileOps;
Minimalist interface vs ideal interface:
On IRC this morning Alexandre said he would prefer a well-designed interface to the minimalist approach, hence this discussion.
Since the interface is entirely internal to commdlg, I will use cdecl calling conventions.
Detailed discussion follows.
The code_page member was put there because the UNIX file name APIs may not use the same code page as other A entry points. WINE uses CP_ACP (as does Windows - although CP_THREAD_ACP is subject to further investigation) for most purposes, but CP_UNIXCP for UNIX path names when translating them to UTF-16. Often CP_UNIXCP will be something like UTF-8, or it may be ISO 8859-1 in situations where Windows would use CP1252. It may be that a Winelib application should have CP_ACP set to the UNIX code page, but it may not (and unless it does something special, it will not).
So places where A->W or W->A conversions happen on path names need to make sure they use the right code page in the context. The minimalist approach was to make this a data member, but the ideal approach would be to have functions which performed the appropriate conversions. Looking through the existing code, the conversions are performed in some contexts where the output is allocated, and others where the output is a fixed size buffer. If we want to have just one conversion function for each direction, this would give us:
CHAR *filename_wtoa(WCHAR const *in, CHAR *out, int bufrange);
WCHAR *filename_atow(CHAR const *in, WCHAR *out, int bufrange);
"in" is the input buffer. "out" is the output buffer (NULL if we want the method to allocate). "bufrange" is the number of elements pointed to by "out".
The return value is a pointer to the file name on success, and is either the value of "out", or an allocated buffer where "out" is NULL. On failure the return value is NULL.
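A minimal sketch of the allocate-or-fixed-buffer contract described above. It assumes the standard C wcstombs/mbstowcs functions as stand-ins for the real conversion (which would go through the UNIX code page, e.g. WideCharToMultiByte with CP_UNIXCP), and it uses wchar_t in place of Wine's WCHAR so that it builds outside the Wine tree. It is an illustration of the calling convention only, not the proposed implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Stand-ins for the Win32 types (assumption: real code uses windef.h). */
typedef char CHAR;
typedef wchar_t WCHAR;

/* out == NULL: allocate the result (caller frees it).
 * out != NULL: write into out, which holds bufrange elements.
 * Returns the converted name, or NULL on failure. */
CHAR *filename_wtoa(WCHAR const *in, CHAR *out, int bufrange)
{
    size_t need;
    CHAR *buf;

    assert(in != NULL);
    need = wcstombs(NULL, in, 0);          /* bytes needed, no terminator */
    if (need == (size_t)-1) return NULL;   /* unconvertible character */

    if (out == NULL) {
        buf = malloc(need + 1);
        if (buf == NULL) return NULL;
    } else {
        if (bufrange <= 0 || need + 1 > (size_t)bufrange) return NULL;
        buf = out;
    }
    wcstombs(buf, in, need + 1);
    return buf;
}

WCHAR *filename_atow(CHAR const *in, WCHAR *out, int bufrange)
{
    size_t need;
    WCHAR *buf;

    assert(in != NULL);
    need = mbstowcs(NULL, in, 0);          /* wide chars needed */
    if (need == (size_t)-1) return NULL;   /* invalid multibyte sequence */

    if (out == NULL) {
        buf = malloc((need + 1) * sizeof(WCHAR));
        if (buf == NULL) return NULL;
    } else {
        if (bufrange <= 0 || need + 1 > (size_t)bufrange) return NULL;
        buf = out;
    }
    mbstowcs(buf, in, need + 1);
    return buf;
}
```

With this shape a caller that already has a stack buffer passes it with its element count, and a caller that wants an allocated result passes NULL and frees the return value; in both cases a NULL return means the conversion did not fit or failed.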
The sep_char is perhaps the rudest part of the minimalist interface since it does not treat the path names as opaque. It is used in the following contexts: The handling of CDM_GETFILEPATH, where it is used to paste the file name and directory name together; and in FILEDLG95_InitControls, where it is used to determine if the input file name has a path component.
With an ideal interface the CDM_GETFILEPATH handling would be changed to use a general path qualification function. Determining if the input file name has a path component could be handled in one of two ways: with a method for querying this; or by searching for the start of the file name component and testing whether that is the start of the string [i.e. find_file_name(input) != input].
I prefer the latter method since it means one less entry in the interface, but if the boolean function were preferred it might appear as:
BOOL has_path(WCHAR const *filename);
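As a rough illustration of why the boolean entry is redundant, the sketch below derives has_path from find_file_name. The find_file_name body here is a hypothetical UNIX-flavoured one (split at the last '/'), and wchar_t stands in for WCHAR; neither is taken from the patch.

```c
#include <assert.h>
#include <wchar.h>

typedef wchar_t WCHAR;   /* stand-in for the Win32 type */

/* Hypothetical UNIX-path implementation: points just past the last '/',
 * or at the start of the string when there is no '/'. */
WCHAR const *find_file_name(WCHAR const *path)
{
    WCHAR const *slash;

    assert(path != NULL);
    slash = wcsrchr(path, L'/');
    return slash ? slash + 1 : path;
}

/* The query needs no interface entry of its own. */
int has_path(WCHAR const *filename)
{
    return find_file_name(filename) != filename;
}
```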
get_top_folder exists because the Windows path versions of the dialog use the Desktop folder as their top level, but a UNIX path version should arguably use the UNIX root as its top level. Operations retrieving the top level folder (SHGetDesktopFolder) appear in many places. I would prefer to return the pointer though, hence:
IShellFolder *get_top_folder(void);
The next three functions in the minimal interface (get_pidl_from_name; get_display_name; and get_folder_from_pidl) are functions that are already implemented for Windows path names and are used to handle conversions between item ID lists and path names. Unless somebody thinks the existing implementations are in need of reworking, I don't see any reason not to include them as is in the interface.
The next two functions (change_directory; and get_directory) are currently direct calls to SetCurrentDirectoryW and GetCurrentDirectoryW in the default implementation of the interface. In the UNIX path versions there is some difficulty in how these should be handled for reasons that are too complex to go into here, but by having the interface at least a start can be made on figuring out how to deal with these. GetCurrentDirectoryW is usually called in contexts where the buffer is allocated (albeit wastefully), but in one case is called on a stack buffer. I am inclined to have the stack buffer replaced by an allocated one, hence:
BOOL set_directory(WCHAR const *dir);
WCHAR *get_directory(void);
Next comes qualify_path, which is used to generate a fully qualified and canonical (no '/../' sequences) path name given a directory name and a (possibly already fully qualified) path name. The minimalist declaration is based on the way this operation was implemented in FILEDLG95_OnOpen, but in accordance with my preference for allocating string buffers I would prefer:
WCHAR *qualify_path(WCHAR const *path, WCHAR const *dir);
It may be that if dir == NULL the function would use the current directory.
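A minimal sketch of what a UNIX-path qualify_path might look like, under several assumptions: '/' is the only separator, canonicalisation is purely lexical (symbolic links are not consulted), wchar_t stands in for WCHAR, and a relative path with dir == NULL is simply rejected rather than resolved against the current directory.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

typedef wchar_t WCHAR;   /* stand-in for the Win32 type */

/* Returns an allocated canonical absolute path (caller frees), or NULL.
 * Assumption: a relative path requires a non-NULL absolute dir. */
WCHAR *qualify_path(WCHAR const *path, WCHAR const *dir)
{
    size_t len, i = 0, o = 0;
    WCHAR *joined, *out;

    assert(path != NULL);
    if (path[0] == L'/') dir = L"";            /* already fully qualified */
    else if (dir == NULL || dir[0] != L'/') return NULL;

    len = wcslen(dir) + 1 + wcslen(path);
    joined = malloc((len + 1) * sizeof(WCHAR));
    out = malloc((len + 2) * sizeof(WCHAR));
    if (joined == NULL || out == NULL) {
        free(joined);
        free(out);
        return NULL;
    }
    wcscpy(joined, dir);
    wcscat(joined, L"/");
    wcscat(joined, path);

    while (joined[i] != L'\0') {
        size_t start, n;

        while (joined[i] == L'/') i++;         /* skip separators */
        start = i;
        while (joined[i] != L'\0' && joined[i] != L'/') i++;
        n = i - start;
        if (n == 0 || (n == 1 && joined[start] == L'.'))
            continue;                          /* empty or "." component */
        if (n == 2 && joined[start] == L'.' && joined[start + 1] == L'.') {
            while (o > 0 && out[--o] != L'/')  /* drop the last component */
                ;
            continue;
        }
        out[o++] = L'/';
        wmemcpy(out + o, joined + start, n);
        o += n;
    }
    if (o == 0) out[o++] = L'/';               /* everything cancelled out */
    out[o] = L'\0';
    free(joined);
    return out;
}
```

For example, qualify_path(L"../lib/./x.so", L"/usr/bin") yields "/usr/lib/x.so", and an already absolute path such as "/etc//passwd" is only cleaned up, giving "/etc/passwd".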
complete_path is only used in one place - in FILEDLG95_OnOpen, where the routine walks through the path elements and tacks a trailing backslash onto paths like "c:" (which will be the first component of the path for "c:\windows\system.ini"). find_next_component (currently set to PathFindNextComponentW) is only used in that same context, so perhaps the better solution is to combine these two with:
WCHAR *next_component(WCHAR const *path, WCHAR const *last);
"last" is the most recent return value (or NULL for the first call). "path" is the input path.
The return value is an allocated string containing the next path component, so you would get "c:", "windows", "system.ini" as return values.
has_invalid_char is used to test if the path name contains any invalid characters. The current default implementation is wrong, but reflects what was already there. The general rule is that Win32 paths (at least in the file dialog) should not contain '/', ':' (except after a drive letter (*)), '<', '>', or '|'. IIRC, wild cards are forbidden in file names, but the filedlg code does not treat them as invalid because they are valid for entry into the edit box of the file dialog. Under UNIX, there are no invalid file name characters - any string is a valid file name, although it may be a relative file name. I would keep this function but rename it:
BOOL valid_file_name(WCHAR const *filename);
(*) can the file dialogs address stream names under NT? Is it meaningful to do the same under Wine since UNIX has no concept of file sub-streams?
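The rule above can be sketched as a pair of implementations, one per path flavour. The names valid_file_name_win32 and valid_file_name_unix are hypothetical (the interface would expose a single valid_file_name pointer), wchar_t and int stand in for WCHAR and BOOL, and the Win32 check follows only the rule as stated in this message, not the existing filedlg code.

```c
#include <assert.h>
#include <wchar.h>

typedef wchar_t WCHAR;   /* stand-ins for the Win32 types */
typedef int BOOL;

/* Win32 flavour: rejects '/', '<', '>', '|', and any ':' that is not
 * the second character following an ASCII drive letter. Wild cards are
 * deliberately accepted, as they are valid in the dialog's edit box. */
BOOL valid_file_name_win32(WCHAR const *filename)
{
    size_t i;

    assert(filename != NULL);
    for (i = 0; filename[i] != L'\0'; i++) {
        WCHAR c = filename[i];

        if (c == L'/' || c == L'<' || c == L'>' || c == L'|')
            return 0;
        if (c == L':') {
            WCHAR d = filename[0];
            BOOL drive = (i == 1) &&
                         ((d >= L'a' && d <= L'z') || (d >= L'A' && d <= L'Z'));
            if (!drive) return 0;
        }
    }
    return 1;
}

/* UNIX flavour: per the text, every string is acceptable. */
BOOL valid_file_name_unix(WCHAR const *filename)
{
    assert(filename != NULL);
    return 1;
}
```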
find_file_name and find_extension find the final path component, and the extension (if any) in the file name. These are fairly pure functions for handling otherwise opaque path names, so they would remain as is but without the WINAPI calling convention:
WCHAR *find_file_name(WCHAR *filename);
WCHAR *find_extension(WCHAR *filename);
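For illustration, a UNIX-path version of these two might look like the sketch below. It assumes '/' as the separator, uses wchar_t in place of WCHAR, and borrows the PathFindExtensionW convention of returning a pointer to the terminating null when there is no extension; that last choice is an assumption, since the proposal does not say what "if any" should return.

```c
#include <assert.h>
#include <wchar.h>

typedef wchar_t WCHAR;   /* stand-in for the Win32 type */

/* Points just past the last '/', or at the start if there is none. */
WCHAR *find_file_name(WCHAR *filename)
{
    WCHAR *slash;

    assert(filename != NULL);
    slash = wcsrchr(filename, L'/');
    return slash ? slash + 1 : filename;
}

/* Points at the last '.' within the file name component, or at the
 * terminating null when the file name has no extension. Searching only
 * the file name keeps a dot in a directory name from matching. */
WCHAR *find_extension(WCHAR *filename)
{
    WCHAR *name = find_file_name(filename);
    WCHAR *dot = wcsrchr(name, L'.');

    return dot ? dot : name + wcslen(name);
}
```

Because both functions return pointers into the caller's string rather than copies, the dialog code can truncate or replace the tail of a path without ever inspecting the separator itself.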
Next come file_exists and is_directory. These could be replaced by a single routine that requests the type of the file and returns -1 if the file does not exist:
int get_file_type(WCHAR const *filename);
Returns (with optional symbolic constants):
-1: no such file
 0: ordinary file
 1: directory
This would also simplify some other code where is_directory is called immediately after file_exists.
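A minimal sketch of a UNIX-side get_file_type built on stat(). The symbolic constant names are hypothetical, wchar_t stands in for WCHAR, and wcstombs stands in for the conversion through the UNIX code page; anything that exists but is not a directory is reported as an ordinary file, which is a simplification.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <wchar.h>

typedef wchar_t WCHAR;   /* stand-in for the Win32 type */

/* Hypothetical names for the three return values. */
enum { FILE_TYPE_NONE = -1, FILE_TYPE_FILE = 0, FILE_TYPE_DIR = 1 };

int get_file_type(WCHAR const *filename)
{
    char path[4096];
    struct stat st;
    size_t n;

    assert(filename != NULL);
    n = wcstombs(path, filename, sizeof(path));
    if (n == (size_t)-1 || n == sizeof(path))
        return FILE_TYPE_NONE;              /* unconvertible or too long */
    if (stat(path, &st) != 0)
        return FILE_TYPE_NONE;              /* no such file */
    return S_ISDIR(st.st_mode) ? FILE_TYPE_DIR : FILE_TYPE_FILE;
}
```

A caller that previously wrote "if (file_exists(p) && is_directory(p))" would then test get_file_type(p) == FILE_TYPE_DIR, with a single stat() call behind it.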
add_dir_sep is used when pasting a path name and a wildcard string. If there are no objections, qualify_path could be used for such situations, thereby avoiding the need for a separate add_dir_sep.
get_full_path is currently a call to GetFullPathNameW. It is used in 3 places in FILEDLG95_InitControls. Each time it is used to do three things: convert any 8.3 filenames to the long path name version; extract the file name portion (if any) of the resulting long path; and extract the directory name portion, storing the two portions in separate locations. I would simplify the call, allocating the result:
WCHAR *get_full_path(WCHAR const *path);
The file name component would then be discovered by a call to find_file_name.
Obvious candidates for changes:
1. find_extension could be implemented using find_file_name in a way that obviates the need for a separate find_extension.
2. Might find_file_name be implemented in terms of next_component? This likely depends on the behaviour of next_component with a path like "f:\windows" - if it only returns "f:" and "windows" then it would not be suitable, but if it returns "f:", "windows" and "" then it would.
Proposed interface:
typedef struct {
    CHAR *(*filename_wtoa)(WCHAR const *in, CHAR *out, int bufrange);
    WCHAR *(*filename_atow)(CHAR const *in, WCHAR *out, int bufrange);

    IShellFolder *(*get_top_folder)(void);
    LPITEMIDLIST (*get_pidl_from_name)(IShellFolder *, LPWSTR);
    BOOL (*get_display_name)(LPCITEMIDLIST, LPWSTR);
    IShellFolder* (*get_folder_from_pidl)(LPITEMIDLIST);

    BOOL (*set_directory)(WCHAR const *dir);
    WCHAR *(*get_directory)(void);
    int (*get_file_type)(WCHAR const *filename);

    WCHAR *(*qualify_path)(WCHAR const *path, WCHAR const *dir);
    WCHAR *(*get_full_path)(WCHAR const *filename);

    WCHAR *(*next_component)(WCHAR const *path, WCHAR const *last);
    WCHAR *(*find_file_name)(WCHAR *filename);
    WCHAR *(*find_extension)(WCHAR *filename);

    BOOL (*valid_file_name)(WCHAR const *filename);
} FileDlgFileOps;