[Bug 39297] New: kernel32.IsValidCodePage and friends don't support code page 708.
https://bugs.winehq.org/show_bug.cgi?id=39297

Bug ID: 39297
Summary: kernel32.IsValidCodePage and friends don't support code page 708.
Product: Wine
Version: unspecified
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: kernel32
Assignee: wine-bugs(a)winehq.org
Reporter: htl10(a)users.sourceforge.net
Distribution: ---

Microsoft's FontValidator (https://www.microsoft.com/typography/FontValidator.mspx), when processing certain fonts, tries to access code page 708, which is not available under Wine. FYI, mono has support for code page 708, cf. mcs/class/I18N/Rare/CP708.cs.

-- Do not reply to this email, post in Bugzilla using the above URL to reply. You are receiving this mail because: You are watching all bug changes.
https://bugs.winehq.org/show_bug.cgi?id=39297

Hin-Tak Leung <htl10(a)users.sourceforge.net> changed:

What     |Removed      |Added
----------------------------------------------------------------------------
Version  |unspecified  |1.7.51

--- Comment #1 from Hin-Tak Leung <htl10(a)users.sourceforge.net> ---
FYI, the list of code pages supported by Wine seems to be just wine/libs/wine/c_*.c
https://bugs.winehq.org/show_bug.cgi?id=39297

Anastasius Focht <focht(a)gmx.net> changed:

What     |Removed  |Added
----------------------------------------------------------------------------
Keywords |         |dotnet, download
URL      |         |https://www.microsoft.com/typography/FontValidator.mspx
CC       |         |focht(a)gmx.net
Summary  |kernel32.IsValidCodePage and friends don't support code page 708. |Microsoft FontValidator (.NET 2.0 app) wants code page 708 when processing certain fonts
https://bugs.winehq.org/show_bug.cgi?id=39297

Hin-Tak Leung <htl10(a)users.sourceforge.net> changed:

What     |Removed  |Added
----------------------------------------------------------------------------
Keywords |dotnet   |
Summary  |Microsoft FontValidator (.NET 2.0 app) wants code page 708 when processing certain fonts |kernel32.IsValidCodePage and friends don't support code page 708.

--- Comment #2 from Hin-Tak Leung <htl10(a)users.sourceforge.net> ---
dotnet is not relevant. I can supply a native code test app if that helps. The mention of mono is just to indicate a source of information, as there seems to be no standard for this encoding. FWIW, GNU libc also supports this encoding:

$ iconv -l | grep -i 708
ASMO-708//
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #3 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- The application calls kernel32.IsValidCodePage() and friends via a mixed mode assembly, and that part is native code.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #4 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- The functions which FontValidator tries to call are:

BOOL kernel32.GetCPInfo( UINT CodePage, LPCPINFO lpCPInfo )
BOOL kernel32.IsValidCodePage( UINT CodePage )
BOOL kernel32.IsDBCSLeadByteEx( UINT CodePage, BYTE TestChar )
int kernel32.MultiByteToWideChar( UINT CodePage, DWORD dwFlags, LPCSTR lpMultiByteStr, int cbMultiByte, LPWSTR lpWideCharStr, int cchWideChar )

As one can see, all of them take a code page argument. For most (all?) code pages, these eventually trace to one of wine/libs/wine/c_*.c. For example, for code page 950 (Traditional Chinese), there is a file wine/libs/wine/c_950.c. Whereas c_950.c and friends are derived from public sources on www.unicode.org, I cannot (yet) find an authoritative source of what code page 708 is, although both glibc and mono claim to support this code page in their internationalization support.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #5 from Nikolay Sivov <bunglehead(a)gmail.com> --- The question is does this tool work on Windows for same fonts? (please name them if it's possible, I can test it if you don't have access to Windows machine). Another easy thing to try is to add test with this cp number for GetCPInfoEx for example, to see if it works at all and if it does see what's returned as a name.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #6 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Nikolay Sivov from comment #5)
> The question is does this tool work on Windows for same fonts? (please name them if it's possible, I can test it if you don't have access to Windows machine).

Of course it does - it is a Microsoft tool! E.g. on Windows 8.1 (which comes with this code page), if you use the tool to analyse Windows 8.1's tahoma.ttf, the generated report says something like:

A CodePage bit is set in ulCodePageRange, but the font is missing some of the printable characters from that codepage, bit #61, Arabic; ASMO 708 (49 missing, first ten missing chars are: U2502 U2524 U2561 U2562 U2556 U2555 U2563 U2551 U2557 U255D)

but running under wine + dotnet 2, it says something about the code page not being installed. Note that:

1. For some unknown reason, it does not want to navigate to c:\windows\fonts on Windows 8.1, but you can copy tahoma.ttf onto the desktop to test.
2. Some parts of it do not work with wine + wine-mono, though this part does. You should de-select all of the table tests except OS/2, and also choose to "save report file" to a location of your choice, rather than the default "open after analysis" (from temp location), if you are testing under wine + wine-mono or wine + dotnet.

> Another easy thing to try is to add test with this cp number for GetCPInfoEx for example, to see if it works at all and if it does see what's returned as a name.

What is the purpose of this test? I already told you what the problem is, and where the bulk of the new code needs to be added: there needs to be a new "wine/libs/wine/c_708.c" file, and an extra line in "wine/libs/wine/cptable.c" to register the new code page table, just like any of the others!
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #7 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- According to https://www.microsoft.com/typography/otspec/os2.htm#cpr code page 708 is "Arabic; ASMO 708".
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #8 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- Note that code page 708 is not the same as code page 1256 (Arabic), which Wine already supports.
https://bugs.winehq.org/show_bug.cgi?id=39297

Dmitry Timoshkov <dmitry(a)baikal.ru> changed:

What           |Removed      |Added
----------------------------------------------------------------------------
Status         |UNCONFIRMED  |NEW
Ever confirmed |0            |1

--- Comment #9 from Dmitry Timoshkov <dmitry(a)baikal.ru> ---
http://www.unicode.org/Public/MAPPINGS doesn't have a ready to use mapping for code page 708, however according to http://coq.no/character-tables/dos/en: "This encoding is an almost compatible superset of ISO 8859/6 (all Arabic letters in the same positions, only one incompatible assignment, adding line-drawing characters and lowercase French accented lowercase vowels)". So, codepage 28596 (ISO 8859-6 Arabic) probably could be used as an alias/replacement.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #10 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Dmitry Timoshkov from comment #9)
> http://www.unicode.org/Public/MAPPINGS doesn't have a ready to use mapping for code page 708, however according to http://coq.no/character-tables/dos/en: "This encoding is an almost compatible superset of ISO 8859/6 (all Arabic letters in the same positions, only one incompatible assignment, adding line-drawing characters and lowercase French accented lowercase vowels)". So, codepage 28596 (ISO 8859-6 Arabic) probably could be used as an alias/replacement.

"Almost compatible superset" is a gross understatement and misdirection. I found 28 code point differences and 45 additions out of 256. Depending on how you count it, 28 (10%) or 73 (30%) incompatible is not "almost" compatible. "Almost compatible" is a joke.
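The counting described above amounts to an entry-by-entry comparison of two 256-slot byte-to-Unicode tables. The following is only a minimal sketch of such a comparison: the names `compare_tables`, `struct cp_diff`, `percent_of_table` and the `CP_UNDEFINED` marker are hypothetical and do not come from Wine or from this thread, and the two tables have to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker meaning "this byte has no mapping in the table". */
#define CP_UNDEFINED 0xFFFFu

/* Result of comparing a base table against a candidate superset. */
struct cp_diff {
    int changed; /* mapped in both tables, but to different code points */
    int added;   /* unmapped in base, mapped in candidate */
    int removed; /* mapped in base, unmapped in candidate */
};

/* Compare two 256-entry byte -> UTF-16 tables entry by entry. */
struct cp_diff compare_tables(const unsigned short base[256],
                              const unsigned short cand[256])
{
    struct cp_diff d = {0, 0, 0};

    assert(base != NULL && cand != NULL);
    for (int i = 0; i < 256; i++) {
        int in_base = base[i] != CP_UNDEFINED;
        int in_cand = cand[i] != CP_UNDEFINED;

        if (in_base && in_cand) {
            if (base[i] != cand[i])
                d.changed++;
        } else if (in_cand) {
            d.added++;
        } else if (in_base) {
            d.removed++;
        }
    }
    return d;
}

/* Share of the 256 slots that a given count represents, in percent. */
double percent_of_table(int count)
{
    return 100.0 * count / 256.0;
}
```

With the counts reported in the comment, 28 changed slots are about 10.9% of the table, and 28 + 45 = 73 slots are about 28.5%.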
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #11 from Dmitry Timoshkov <dmitry(a)baikal.ru> --- (In reply to Hin-Tak Leung from comment #10)
>> http://www.unicode.org/Public/MAPPINGS doesn't have a ready to use mapping for code page 708, however according to http://coq.no/character-tables/dos/en: "This encoding is an almost compatible superset of ISO 8859/6 (all Arabic letters in the same positions, only one incompatible assignment, adding line-drawing characters and lowercase French accented lowercase vowels)". So, codepage 28596 (ISO 8859-6 Arabic) probably could be used as an alias/replacement.

> "Almost compatible superset" is a gross understatement and misdirection. I found 28 code point differences and 45 additions out of 256. Depending on how you count it, 28 (10%) or 73 (30%) incompatible is not "almost" compatible. "Almost compatible" is a joke.

It depends on what you think is the most useful part of the table. Since apparently that's the Arabic alphabet, it's a good solution; if you need something else, then it's probably worth at least mentioning that.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #12 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Dmitry Timoshkov from comment #11)
> It depends on what you think is the most useful part of the table. Since apparently that's the Arabic alphabet, it's a good solution; if you need something else, then it's probably worth at least mentioning that.

The FontValidator already tests for Arabic code page 1256, which I believe is ISO 8859-6, in bit 6. Bit 61 is something else. This is a tool for testing compliance with an ISO specification, 14496-22 (the OpenType format), so nothing less than an exact match is good enough. 'Useful' and '30% incompatible' isn't good enough when testing for compliance with an ISO standard. '30% incompatible' is not compatible, as far as standard compliance is concerned.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #13 from Dmitry Timoshkov <dmitry(a)baikal.ru> --- (In reply to Hin-Tak Leung from comment #12)
>> It depends on what you think is the most useful part of the table. Since apparently that's the Arabic alphabet, it's a good solution; if you need something else, then it's probably worth at least mentioning that.

> The FontValidator already tests for Arabic code page 1256, which I believe is ISO 8859-6, in bit 6. Bit 61 is something else. This is a tool for testing compliance with an ISO specification, 14496-22 (the OpenType format), so nothing less than an exact match is good enough.
> 'Useful' and '30% incompatible' isn't good enough when testing for compliance with an ISO standard. '30% incompatible' is not compatible, as far as standard compliance is concerned.

If you could provide a reference to a code page table compatible with the format of the unicode.org tables, or one that could be easily adapted, that would be great.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #14 from Nikolay Sivov <bunglehead(a)gmail.com> --- That one maybe https://msdn.microsoft.com/en-us/library/cc195061.aspx ? Chars below 0x20 are mapped directly presumably.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #15 from Dmitry Timoshkov <dmitry(a)baikal.ru> --- (In reply to Nikolay Sivov from comment #14)
> That one maybe https://msdn.microsoft.com/en-us/library/cc195061.aspx ? Chars below 0x20 are mapped directly presumably.

That's not really a table but rather a chart.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #16 from Hin-Tak Leung <htl10(a)users.sourceforge.net> ---
Created attachment 52399 --> https://bugs.winehq.org/attachment.cgi?id=52399
encoding code point to unicode mapping table of code page 708

This is, to the best of my knowledge, the mapping table of code page 708. It is derived from hexdump'ing c_708.nls from a typical Windows installation, plus some on-line description of what the format of the nls file type might be.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #17 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- BTW, the nls file format seems to be undocumented, and wine therefore cannot/does not make use of them, but that's the subject matter of bug 39298 - "kernel32 does not support custom nls installation".
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #18 from Dmitry Timoshkov <dmitry(a)baikal.ru> --- (In reply to Hin-Tak Leung from comment #16)
> Created attachment 52399 [details] encoding code point to unicode mapping table of code page 708
> This is, to the best of my knowledge, the mapping table of code page 708. It is derived from hexdump'ing c_708.nls from a typical Windows installation, plus some on-line description of what the format of the nls file type might be.

Unfortunately I don't think that Wine can use this (and attaching such a dump may violate copyright); there is a reason why Wine uses the unicode.org tables.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #19 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Nikolay Sivov from comment #14)
> That one maybe https://msdn.microsoft.com/en-us/library/cc195061.aspx ? Chars below 0x20 are mapped directly presumably.

That chart seems to be wrong or outdated. In modern Windows, code points 243-248 are definitely used, AFAIK. See the table attached to this bug. On real Windows, you can probably write a little program to dump the mapping table for code page 708, by doing something like this:

------------------------------------------
char inputbuf[1];
wchar_t outputbuf[1];
UINT CodePage = 708;
for (unsigned short c = 0; c < 256; c++) {
    inputbuf[0] = (char)c;
    if (MultiByteToWideChar(CodePage, MB_ERR_INVALID_CHARS,
                            inputbuf, 1, outputbuf, 1) != 0) {
        // dump c and outputbuf[0] in two columns
    }
}
-----------------------------------------

and hopefully it should result in something identical to what I attached.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #20 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Dmitry Timoshkov from comment #18) ...
> Unfortunately I don't think that Wine can use this (and attaching such dump may violate a copyright), there is a reason why Wine uses unicode.org tables.

That's the "authoritative" source I derived the table from; as I wrote in previous comments, both mono and glibc claim to support code page 708, so they must somehow have got a listing of the mapping table, or equivalent, from somewhere.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #21 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- On a glibc system (i.e. Linux), you can get at more or less the same mapping table by doing something like this:

perl -e 'for ($i = 0; $i<256; $i++) {printf "%c\n", ($i);}' \
  | iconv -t UTF16 -f ASMO-708 -c | hexdump -C
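The same dump can also be produced from C through iconv(3) rather than the perl/iconv/hexdump pipeline. This is a minimal sketch, assuming a system whose `iconv_open` accepts the charset name ASMO-708 (glibc lists it, as shown earlier in the thread); the helper names `decode_byte` and `dump_table` and the 0xFFFF "unmapped" marker are made up for illustration. Note that the IANA charset registry lists ASMO-708 as an alias of ISO_8859-6:1987, so the result may well be the ISO 8859-6 table rather than the Windows code page 708 table, which would fit the "more or less the same" wording above.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Convert a single byte through the converter `cd` (opened with a
 * UTF-16LE target) and store the resulting UTF-16 unit in *out.
 * Returns 1 on success, 0 if the byte is unmapped or does not
 * convert to exactly one UTF-16 unit.
 */
int decode_byte(iconv_t cd, unsigned char byte, unsigned short *out)
{
    char in[1] = { (char)byte };
    unsigned char buf[4];
    char *inp = in;
    char *outp = (char *)buf;
    size_t inleft = 1, outleft = sizeof buf;

    assert(out != NULL);
    iconv(cd, NULL, NULL, NULL, NULL); /* reset the conversion state */
    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1)
        return 0;
    if (sizeof buf - outleft != 2)
        return 0;
    *out = (unsigned short)(buf[0] | (buf[1] << 8)); /* little-endian */
    return 1;
}

/*
 * Fill table[0..255] with the mapping of `charset`; unmapped bytes get
 * 0xFFFF. Returns the number of mapped bytes, or -1 if the converter
 * cannot be opened on this system.
 */
int dump_table(const char *charset, unsigned short table[256])
{
    iconv_t cd = iconv_open("UTF-16LE", charset);
    int mapped = 0;

    if (cd == (iconv_t)-1)
        return -1;
    for (int c = 0; c < 256; c++) {
        if (decode_byte(cd, (unsigned char)c, &table[c]))
            mapped++;
        else
            table[c] = 0xFFFF;
    }
    iconv_close(cd);
    return mapped;
}
```

Calling `dump_table("ASMO-708", table)` and printing each mapped `c` and `table[c]` with a format such as "0x%02X 0x%04X\n" gives a two-column listing that can be diffed against a dump made on Windows.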
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #22 from Austin English <austinenglish(a)gmail.com> --- The content of attachment 52399 has been deleted for the following reason: Reverse engineered from Windows dll
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #23 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Austin English from comment #22)
> The content of attachment 52399 [details] has been deleted for the following reason:
> Reverse engineered from Windows dll

FWIW, nls files are not DLLs. The format does not seem to be officially documented, but there is an on-line description of it: it is basically a header plus two arrays, one for raw encoding and another for console output (they differ only in the unprintable characters below 0x20, I think), followed by the reverse encoding table from Unicode back to code points. So for single-byte encodings (i.e. non-CJK), somewhat after the beginning of the file there is simply an array of 512 bytes, telling you how 0-255 are mapped to Unicode (UTF-16). It is hardly "reverse-engineering" if you simply read 2 x 256 bytes and write them out. The array is quite easy to spot, because for single-byte encodings ASCII characters map to themselves, so the array shows up as the ASCII values in order, each padded with a null high byte.

Wine's source never documented how wine/loader/l_intl.nls was made, but if the logic of making it (i.e. writing it) is implemented and extended to reading such files as well, you can close this bug as a duplicate of bug 39298 - "kernel32 does not support custom nls installation" - since Wine being capable of reading nls files would also mean that one can install code page 708 as an add-on.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #24 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- (In reply to Hin-Tak Leung from comment #19)

Alternatively, if you fill out the program below, and make the result of running it under Wine match the result of running it under Windows, would that be considered reverse-engineering? All it does is make one array inside Wine match, so you can probably do it "clean-room" style, building the array one element at a time.

> On real windows, you can probably write a little program to dump the mapping table for code page 708, by doing something like this:
> ------------------------------------------
> char inputbuf[1];
> wchar_t outputbuf[1];
> UINT CodePage = 708;
> for (unsigned short c = 0; c < 256; c++) {
>     inputbuf[0] = (char)c;
>     if (MultiByteToWideChar(CodePage, MB_ERR_INVALID_CHARS,
>                             inputbuf, 1, outputbuf, 1) != 0) {
>         // dump c and outputbuf[0] in two columns
>     }
> }
> -----------------------------------------
> and hopefully it should result in something identical to what I attached.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #25 from Austin English <austinenglish(a)gmail.com> --- (In reply to Hin-Tak Leung from comment #24)
> (In reply to Hin-Tak Leung from comment #19)
> Alternatively, if you fill out the program below, and make running it under wine matches the result of running it under windows, would that be considered reverse-engineering?

Questions like this belong on wine-devel; not everyone reads wine-bugs.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #26 from Hin-Tak Leung <htl10(a)users.sourceforge.net> --- Here is the full working version of the test code:

-----------------
#include <windows.h>
#include <stdio.h>

int main(void)
{
    char inputbuf[1];
    wchar_t outputbuf[1];
    UINT CodePage = 708;

    /* Try every byte value; print only those the code page maps. */
    for (unsigned short c = 0; c < 256; c++) {
        inputbuf[0] = (char)c;
        if (MultiByteToWideChar(CodePage, MB_ERR_INVALID_CHARS,
                                inputbuf, 1, outputbuf, 1) != 0) {
            printf("0x%02X 0x%04X\n", c, outputbuf[0]);
        }
    }
    return 0;
}
-----------------

You can cross-compile it with:

i686-w64-mingw32-gcc -Wall -o test.exe cp708test.c

then run it with:

wine test.exe

It just writes the mapping table out as a two-column table. Change 708 to 1252 to test code page 1252, etc. if you wish.
https://bugs.winehq.org/show_bug.cgi?id=39297 --- Comment #27 from Nikolay Sivov <bunglehead(a)gmail.com> --- This was added with https://source.winehq.org/git/wine.git/?a=commit;h=1ca4536f7edcc884711a7dab3.... One thing still missing is the textual name that GetCPInfoEx() returns; I'll send another patch for that. Please retest.
https://bugs.winehq.org/show_bug.cgi?id=39297

Nikolay Sivov <bunglehead(a)gmail.com> changed:

What          |Removed  |Added
----------------------------------------------------------------------------
Status        |NEW      |RESOLVED
Fixed by SHA1 |         |1ca4536f7edcc884711a7dab39870df5d20c9785
Resolution    |---      |FIXED

--- Comment #28 from Nikolay Sivov <bunglehead(a)gmail.com> ---
Missing name was added with https://source.winehq.org/git/wine.git/commit/af17fcbc1c4ccb354208d1d45dbb03.... Marking fixed.
https://bugs.winehq.org/show_bug.cgi?id=39297

Alexandre Julliard <julliard(a)winehq.org> changed:

What    |Removed   |Added
----------------------------------------------------------------------------
Status  |RESOLVED  |CLOSED

--- Comment #29 from Alexandre Julliard <julliard(a)winehq.org> ---
Closing bugs fixed in 5.20.
https://bugs.winehq.org/show_bug.cgi?id=39297

Anastasius Focht <focht(a)gmx.net> changed:

What  |Removed  |Added
----------------------------------------------------------------------------
URL   |https://www.microsoft.com/typography/FontValidator.mspx |https://web.archive.org/web/20180223132312/http://download.microsoft.com/download/F/E/9/FE9795A3-756E-4F60-8989-03DC9870F189/fontvalsetup.msi