[Bug 26632] New: MultiByteToWideChar with MB_ERR_INVALID_CHARS doesn't fail for some code points.
http://bugs.winehq.org/show_bug.cgi?id=26632

           Summary: MultiByteToWideChar with MB_ERR_INVALID_CHARS doesn't fail
                    for some code points.
           Product: Wine
           Version: 1.3.17
          Platform: x86
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: kernel32
        AssignedTo: wine-bugs(a)winehq.org
        ReportedBy: sagawa.aki+winebugs(a)gmail.com

Created an attachment (id=33899)
 --> (http://bugs.winehq.org/attachment.cgi?id=33899)
test MB_ERR_INVALID_CHARS

I ran the attached source code on both Wine and Windows XP. In some codepages, including Japanese (CP932), the results don't match. For instance, Japanese Windows marks `X' (conversion failed) for 0xA0, 0xFD, 0xFE and 0xFF, but Wine (LANG=ja_JP.UTF-8) marks `o' (OK) for them. This only happens when I pass MB_ERR_INVALID_CHARS to MultiByteToWideChar.

This article might help:
http://blogs.msdn.com/b/michkap/archive/2007/07/25/4037646.aspx

-- 
Configure bugmail: http://bugs.winehq.org/userprefs.cgi?tab=email
Do not reply to this email, post in Bugzilla using the above URL to reply.
------- You are receiving this mail because: -------
You are watching all bug changes.
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #1 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-02 03:43:01 CDT ---
Created an attachment (id=33900)
 --> (http://bugs.winehq.org/attachment.cgi?id=33900)
run result (Windows XP, Windows 7)
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #2 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-02 03:45:40 CDT ---
Created an attachment (id=33901)
 --> (http://bugs.winehq.org/attachment.cgi?id=33901)
run result (Wine 1.3.17)
http://bugs.winehq.org/show_bug.cgi?id=26632

Sagawa <sagawa.aki+winebugs(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #33899|application/octet-stream    |text/plain
          mime type|                            |
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #3 from Nikolay Sivov <bunglehead(a)gmail.com> 2011-04-02 04:35:18 CDT ---
Is this another difference in Microsoft's interpretation of Unicode?
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #4 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-02 07:55:23 CDT ---
(In reply to comment #3)
> Is this another difference in Microsoft's interpretation of Unicode?

Probably yes. Their implementation converts some undefined byte values to code points in the Unicode Private Use Area (PUA). This is needed for round-trip conversion (e.g. ANSI 0xFF becomes U+F8F3, which must then convert back to ANSI 0xFF). However, the PUA is not an appropriate mapping target, because the Unicode standard does not define characters in that area. Thus the MB_ERR_INVALID_CHARS flag does not allow conversion to the PUA.
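The round trip described above can be pictured with a minimal C sketch. It is an illustration only, not Wine or Windows code: the function names are hypothetical, only the 0xFF <-> U+F8F3 pair is quoted in this thread, and the other three values are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Forward direction: byte -> UTF-16 code unit, for the four CP932 bytes
 * discussed in this thread only. 0xFF -> U+F8F3 is quoted in the thread;
 * the other three values are ASSUMED for illustration.
 * Returns 0 for any other byte. */
uint16_t undefined_byte_to_pua(unsigned char b)
{
    switch (b) {
    case 0xA0: return 0xF8F0; /* assumed */
    case 0xFD: return 0xF8F1; /* assumed */
    case 0xFE: return 0xF8F2; /* assumed */
    case 0xFF: return 0xF8F3; /* from the thread */
    default:   return 0;
    }
}

/* Reverse direction: returns the original byte, or -1 if wc is not one
 * of the four placeholder code points above. */
int pua_to_undefined_byte(uint16_t wc)
{
    switch (wc) {
    case 0xF8F0: return 0xA0;
    case 0xF8F1: return 0xFD;
    case 0xF8F2: return 0xFE;
    case 0xF8F3: return 0xFF;
    default:     return -1;
    }
}

/* Usage: aborts unless b survives byte -> PUA -> byte unchanged,
 * which holds only for the four bytes handled above. */
void check_round_trip(unsigned char b)
{
    assert(pua_to_undefined_byte(undefined_byte_to_pua(b)) == b);
}
```

The placeholder code points exist only so that the byte can be recovered later; they carry no character meaning of their own, which is why a strict conversion has a reason to reject them.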
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #5 from Nikolay Sivov <bunglehead(a)gmail.com> 2011-04-02 08:47:17 CDT ---
AFAIK Wine's Unicode tables are generated directly from unicode.org data, and I'm sure there was already a request to customize that (in case Windows does some things differently). I don't remember why it was rejected; probably because MS tweaks its implementation from release to release and we want to rely on standard data. Somebody experienced in that area could help here. Dmitry?
http://bugs.winehq.org/show_bug.cgi?id=26632

Dan Kegel <dank(a)kegel.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dank(a)kegel.com

--- Comment #6 from Dan Kegel <dank(a)kegel.com> 2011-04-02 10:14:04 CDT ---
What is the impact on the user of this problem?
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #7 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-02 23:10:41 CDT ---
(In reply to comment #5)
> AFAIK Wine's Unicode tables are generated directly from unicode.org data, and
> I'm sure there was already a request to customize that (in case Windows does
> some things differently).

The Unicode tables are not the problem. The tables don't cover the MB_ERR_INVALID_CHARS behavior. Without flags, the results (WCHARs) are the same as on Windows. In my opinion, Wine should raise an error when a conversion with MB_ERR_INVALID_CHARS would produce certain PUA code points (as Windows does). Currently Wine raises an error only when it meets the default Unicode character.

(In reply to comment #6)
> What is the impact on the user of this problem?

For instance, encoding detection. In Japanese CP932, 0xFE is an undefined byte that is not used in any valid sequence. But in EUC-JP (another encoding), it is part of valid code sequences. On Windows, the MB_ERR_INVALID_CHARS flag helps tell them apart.
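The encoding-detection use case can be sketched in portable C. This is a simplified structural check under stated assumptions, not what Windows or Wine actually runs: `looks_like_cp932` and its helpers are hypothetical names, the lead and trail byte ranges are the usual Shift_JIS layout, and the check does not verify that a double-byte pair is actually assigned. On Windows the same question would be put to MultiByteToWideChar with code page 932 and MB_ERR_INVALID_CHARS, treating a zero return as "not CP932".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lead bytes of double-byte characters in the Shift_JIS layout. */
static bool cp932_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Trail bytes of double-byte characters in the Shift_JIS layout. */
static bool cp932_is_trail(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Hypothetical helper: returns false as soon as the input contains one
 * of the bytes this thread reports as failing (0xA0, 0xFD, 0xFE, 0xFF)
 * or a lead byte without a plausible trail byte. 0x80 is accepted as a
 * single byte, matching the Windows behavior reported in the thread. */
bool looks_like_cp932(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        if (b == 0xA0 || b >= 0xFD)
            return false;               /* undefined single byte */
        if (cp932_is_lead(b)) {
            if (i + 1 >= len || !cp932_is_trail(s[i + 1]))
                return false;           /* truncated or malformed pair */
            i += 2;
        } else {
            i += 1;                     /* ASCII, 0x80, or 0xA1..0xDF */
        }
    }
    return true;
}
```

A detector would typically try this check first and fall back to EUC-JP when it fails; for example, the two-byte sequence 0xB0 0xFE is structurally valid EUC-JP but is rejected here because of the 0xFE byte.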
http://bugs.winehq.org/show_bug.cgi?id=26632

Dmitry Timoshkov <dmitry(a)codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1

--- Comment #8 from Dmitry Timoshkov <dmitry(a)codeweavers.com> 2011-04-03 00:15:05 CDT ---
(In reply to comment #0)
> For instance, Japanese Windows marks `X' (conversion fail) for 0xA0, 0xFD, 0xFE
> and 0xFF. But Wine (LANG=ja_JP.UTF-8) marks `o' (OK) for them.

ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT specifically marks 0xA0, 0xFD, 0xFE and 0xFF as #UNDEFINED; our parser needs to mark those as invalid somehow.
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #9 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-03 00:49:12 CDT ---
Created an attachment (id=33913)
 --> (http://bugs.winehq.org/attachment.cgi?id=33913)
proposed patch
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #10 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-03 01:10:11 CDT ---
(In reply to comment #8)
> ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
> specifically marks 0xA0, 0xFD, 0xFE and 0xFF as #UNDEFINED; our parser needs
> to mark those as invalid somehow.

Yes, that's right. Please note that 0x80 is also marked as #UNDEFINED in CP932.TXT, but MultiByteToWideChar with MB_ERR_INVALID_CHARS doesn't complain about it. The big difference between them is that 0xA0, 0xFD, 0xFE and 0xFF are mapped into the Private Use Area in bestfit932.txt [1], while 0x80 is not.

[1] ftp://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit932.txt

I wrote a proposed patch using this observation.
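The observation above suggests a two-part strict check: reject a byte if it falls back to the default Unicode character, or if its mapped value lies in the Private Use Area. The C sketch below illustrates that idea on a toy single-byte table. It is not the attached patch: `struct sbcs_table`, `first_invalid_byte` and `demo_table` are hypothetical names, the table contents are made up except for 0xFF -> U+F8F3, and the exemption for the byte that legitimately maps to the default character is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a single-byte code page table; this is not
 * Wine's real data structure. */
struct sbcs_table {
    uint16_t cp2uni[256];     /* byte -> UTF-16 code unit */
    uint16_t def_unicode;     /* value stored for unmapped bytes */
    unsigned char def_char;   /* byte that legitimately maps to def_unicode */
};

/* Private Use Area of the Basic Multilingual Plane: U+E000..U+F8FF. */
static bool is_private_use_area(uint16_t wc)
{
    return wc >= 0xE000 && wc <= 0xF8FF;
}

/* Returns the index of the first byte a strict conversion would reject,
 * or len if every byte is acceptable. */
size_t first_invalid_byte(const struct sbcs_table *t,
                          const unsigned char *src, size_t len)
{
    assert(t != NULL && (src != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        uint16_t wc = t->cp2uni[src[i]];
        if (wc == t->def_unicode && src[i] != t->def_char)
            return i;                 /* unmapped byte */
        if (is_private_use_area(wc))
            return i;                 /* placeholder mapping into the PUA */
    }
    return len;
}

/* Toy table for demonstration: identity mapping (so 0x80 -> U+0080),
 * except 0xFF -> U+F8F3 as quoted in the thread. Not real CP932 data. */
const struct sbcs_table *demo_table(void)
{
    static struct sbcs_table t;
    static bool ready = false;
    if (!ready) {
        for (int i = 0; i < 256; i++)
            t.cp2uni[i] = (uint16_t)i;
        t.def_unicode = 0x003F;       /* '?' */
        t.def_char = '?';
        t.cp2uni[0xFF] = 0xF8F3;
        ready = true;
    }
    return &t;
}
```

With the toy table, a buffer containing 0x80 passes while a buffer containing 0xFF is rejected at the offending index, which mirrors the difference between the two kinds of #UNDEFINED bytes described above.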
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #11 from Dmitry Timoshkov <dmitry(a)codeweavers.com> 2011-04-03 08:43:03 CDT ---
(In reply to comment #9)
> Created an attachment (id=33913)
>  --> (http://bugs.winehq.org/attachment.cgi?id=33913)
> proposed patch

This should be done in libs/wine/mbtowc.c, in check_invalid_chars_dbcs(), instead.
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #12 from Dmitry Timoshkov <dmitry(a)codeweavers.com> 2011-04-03 08:46:18 CDT ---
(In reply to comment #11)
> > Created an attachment (id=33913)
> >  --> (http://bugs.winehq.org/attachment.cgi?id=33913)
> > proposed patch
>
> This should be done in libs/wine/mbtowc.c, in check_invalid_chars_dbcs(),
> instead.

Hmm, please ignore my comment; somehow I sent a comment on the wrong patch, sorry.
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #13 from Sagawa <sagawa.aki+winebugs(a)gmail.com> 2011-04-03 09:35:28 CDT ---
Created an attachment (id=33916)
 --> (http://bugs.winehq.org/attachment.cgi?id=33916)
testcase for the patch

(In reply to comment #11)
> This should be done in libs/wine/mbtowc.c, in check_invalid_chars_dbcs(),
> instead.

Thank you for your comments. The third hunk of the patch is for check_invalid_chars_dbcs(). I also patched check_invalid_chars_sbcs() because some SBCS codepages, e.g. CP1255, have the same problem. The attached test case examines this behavior.
http://bugs.winehq.org/show_bug.cgi?id=26632

--- Comment #14 from Dmitry Timoshkov <dmitry(a)codeweavers.com> 2011-04-03 22:59:54 CDT ---
Please send to wine-patches (both the test and the fix).
http://bugs.winehq.org/show_bug.cgi?id=26632

Austin English <austinenglish(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #15 from Austin English <austinenglish(a)gmail.com> 2011-04-05 13:15:51 CDT ---
Fixed by
http://source.winehq.org/git/wine.git/commitdiff/16d57370090a63560e01e646995...
http://bugs.winehq.org/show_bug.cgi?id=26632

Alexandre Julliard <julliard(a)winehq.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #16 from Alexandre Julliard <julliard(a)winehq.org> 2011-04-15 12:49:56 CDT ---
Closing bugs fixed in 1.3.18.