Consider the following MSVC program:

--------------------- cut -------------------------
// PruebaOpenDlg.cpp : Defines the entry point for the console application.
//

#include <stdio.h>
#include <stdlib.h>
#include <string.h>     // memset, strerror
#include <errno.h>

#include <windows.h>

int main(int argc, char* argv[])
{
    OPENFILENAME ofn;       // common dialog box structure
    char szFile[260];       // buffer for file name

    // Initialize OPENFILENAME
    ZeroMemory(&ofn, sizeof(OPENFILENAME));
    ofn.lStructSize = sizeof(OPENFILENAME);
    ofn.hwndOwner = NULL;
    ofn.lpstrFile = szFile;
    ofn.nMaxFile = sizeof(szFile);
    ofn.lpstrFilter = "All\0*.*\0Text\0*.TXT\0";
    ofn.nFilterIndex = 1;
    ofn.lpstrFileTitle = NULL;
    ofn.nMaxFileTitle = 0;
    ofn.lpstrInitialDir = NULL;
    // ofn.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST;

    // Display the Open dialog box.
    memset(szFile, 0, sizeof(szFile));
    if (GetOpenFileName(&ofn)==TRUE)
    {
        char * p;
        FILE * hFile;

        // Show the returned name, then dump it byte by byte
        printf("Chosen filename is: %s\n", ofn.lpstrFile);
        printf("Byte encoding is :");
        for (p = ofn.lpstrFile; *p; p++)
        {
            printf(" (%c %02x)", *p, *p);
        }
        printf("\n");

        // Check whether the C runtime can open the returned name
        hFile = fopen(ofn.lpstrFile, "rb");
        if (hFile != NULL)
        {
            fclose(hFile);
            puts("File is readable through specified filename");
        }
        else
        {
            printf("Unable to reach file through %s - %s\n", ofn.lpstrFile, strerror(errno));
        }
    }
    return 0;
}
--------------------- cut -------------------------
Consider also the following Linux environment: the home directory is /home/alex and is mapped to drive F: in dosdevices. The home directory contains a directory named gatón (the name contains a [U+00F3 LATIN SMALL LETTER O WITH ACUTE] and is UTF-8 encoded as 0x67 0x61 0x74 0xC3 0xB3 0x6E). Inside it is a sample file, which is the one to be selected in the Open File dialog. All tests were run on a Fedora Core 4 system with a *default* LANG=es_EC.UTF-8.
The symptom is that, when Wine runs with a UTF-8 locale (as specified with the LANG environment variable) and an attempt is made to choose a filename that is UTF-8 encoded in the filesystem, GetOpenFileNameA may return a byte string that CreateFile and other file functions are unable to map to a valid filename. Whether GetOpenFileNameA returns a valid filename seems to depend on how the navigation is performed. If the application starts the Open File dialog from the current directory and the user navigates by directory change only, the invalid filename is returned. However, if the user first chooses a drive letter (such as F:) and then navigates from there, the filename returned is a valid one.
The following tests illustrate the behavior. For each entry, the first two lines are the conditions for the test. The remaining three lines are the actual output from the supplied program, copied and pasted from the console. The instances of \uffff seen are from invalid character encodings displayed in the console.
LANG=en_US
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=en_US
From explicit choice from drive F: :
Chosen filename is: F:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC
From explicit choice from drive F: :
Chosen filename is: F:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC.UTF-8
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
Unable to reach file through f:\gatón\Barenaked Ladies - One Week.mp3 - No such file or directory

LANG=es_EC.UTF-8
From explicit choice from drive F: :
Chosen filename is: F:\gat\uffffn\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff fffffff3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename
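
A note on reading the byte dumps above: the ffffff prefix on bytes such as c3, b3 and f3 is not part of the filename. It most likely appears because char is signed under MSVC's default settings, so each byte at or above 0x80 is sign-extended when it is promoted to int for printf. The sketch below contrasts the two formattings; the helper names are hypothetical and a 32-bit int is assumed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a byte the way the test program does.
 * The signed char is promoted to int, so a negative value is
 * sign-extended. Returns the number of characters written
 * (assumes a 32-bit int). */
int format_sign_extended(signed char c, char *out, size_t outlen)
{
    assert(out != NULL);
    return snprintf(out, outlen, "%02x", (unsigned int)(int)c);
}

/* Hypothetical helper: convert to unsigned char first, so only the
 * low eight bits are printed. */
int format_as_byte(signed char c, char *out, size_t outlen)
{
    assert(out != NULL);
    return snprintf(out, outlen, "%02x", (unsigned int)(unsigned char)c);
}
```

For the byte 0xC3, the first helper yields ffffffc3 and the second yields c3. Casting *p to unsigned char in the test program's printf call would give the shorter form.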
Case 5 (LANG=es_EC.UTF-8, starting from the current directory) is incorrect, yet it is the easiest one to hit in the UTF-8 locales.
This problem is significant because every Fedora release since at least Fedora Core 2 has UTF-8 support, which is probably enabled in non-US locales. Other popular distributions probably have UTF-8 support enabled too.

I am posting this on wine-devel instead of filing a bug report because I would like some comments on what the expected behavior should be before I try to submit a patch myself. Unless somebody says otherwise, I will try to submit a patch that makes case 5 behave like case 6, by changing the encoding of the ANSI string to match what the file-open functions expect for the filename.

However, this essentially requires an answer to the following question: should non-Unicode strings that represent filenames be UTF-8 encoded, or locale encoded? In the UTF-8 locales, GetOpenFileNameA sometimes seems to assume UTF-8, while the file-open functions expect the locale encoding (ISO-8859-1 in my case). Hence the incorrect behavior. How would the answer change (if at all) for Chinese or Japanese locales, which need multibyte characters?
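
To make the difference between case 5 and case 6 concrete, here is a minimal sketch of the byte-level conversion involved. It is only an illustration: utf8_to_latin1 is a hypothetical helper, not a Wine function, and an actual patch would presumably go through Wine's own code-page conversion rather than hand-rolled code like this. It assumes the target encoding is ISO-8859-1, as in my setup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Wine): convert a NUL-terminated UTF-8
 * string to ISO-8859-1. Returns the number of bytes written, excluding
 * the terminating NUL, or -1 if the input is malformed, contains a code
 * point above U+00FF, or does not fit in dst. */
int utf8_to_latin1(const char *src, char *dst, size_t dstlen)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstlen == 0)
        return -1;

    while (*s) {
        unsigned char out;

        if (*s < 0x80) {
            /* ASCII bytes are identical in both encodings. */
            out = *s++;
        } else if ((*s == 0xC2 || *s == 0xC3) && (s[1] & 0xC0) == 0x80) {
            /* Two-byte sequences led by 0xC2 or 0xC3 cover U+0080..U+00FF. */
            out = (unsigned char)(((s[0] & 0x1F) << 6) | (s[1] & 0x3F));
            s += 2;
        } else {
            /* Malformed, or not representable in ISO-8859-1. */
            return -1;
        }

        if (n + 1 >= dstlen)    /* keep room for the terminating NUL */
            return -1;
        dst[n++] = (char)out;
    }
    dst[n] = '\0';
    return (int)n;
}
```

Given the directory name as returned in case 5 (67 61 74 c3 b3 6e), the sketch produces 67 61 74 f3 6e, which is the sequence seen in case 6. Any character outside U+0000..U+00FF makes it fail, which is exactly where the question about multibyte locales comes in.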
Alex Villacís Lasso