http://bugs.winehq.org/show_bug.cgi?id=30992
Bug #: 30992
Summary: msxml3 incorrectly parses Cyrillic text with spaces (needed for Civilization IV)
Product: Wine
Version: 1.5.7
Platform: x86
OS/Version: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: msxml3
AssignedTo: wine-bugs@winehq.org
ReportedBy: sikon@ubuntu.com
Classification: Unclassified
In the official Russian version of Civilization IV, all Russian text is displayed without spaces under Wine unless the native msxml3 is installed (using winetricks msxml3).
This seems to be related to the parsing of XML files containing a mix of Russian text encoded as XML entities and spaces.
--- Comment #1 from Nikolay Sivov bunglehead@gmail.com 2012-06-24 06:57:43 CDT ---
Could you please get a +msxml,+tid log that shows the relevant document loading? It would also be helpful to get a simple document that contains such data.
Is there a Russian demo version by the way?
--- Comment #2 from Maia Kozheva sikon@ubuntu.com 2012-06-24 07:00:05 CDT ---
Created attachment 40678
  --> http://bugs.winehq.org/attachment.cgi?id=40678
Civilization IV with Russian text and no spaces
--- Comment #3 from Maia Kozheva sikon@ubuntu.com 2012-06-24 07:00:43 CDT ---
Created attachment 40679
  --> http://bugs.winehq.org/attachment.cgi?id=40679
An example offending XML file
Nikolay Sivov bunglehead@gmail.com changed:
What                        |Removed   |Added
----------------------------------------------------------------------------
Attachment #40679 mime type |text/xml  |text/plain
--- Comment #4 from Maia Kozheva sikon@ubuntu.com 2012-06-24 07:12:34 CDT ---
The debug text output is too big to attach (12 MB); you can find it at http://homepc.lucidfox.org/stdout.txt.gz .
--- Comment #5 from Maia Kozheva sikon@ubuntu.com 2012-06-24 07:34:54 CDT ---
Created attachment 40681
  --> http://bugs.winehq.org/attachment.cgi?id=40681
Correct text with spaces (native msxml3)
Austin English austinenglish@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Severity  |normal   |minor
Nikolay Sivov bunglehead@gmail.com changed:
What            |Removed      |Added
----------------------------------------------------------------------------
Status          |UNCONFIRMED  |NEW
Ever Confirmed  |0            |1
--- Comment #6 from Nikolay Sivov bunglehead@gmail.com 2012-06-26 04:00:44 CDT ---
I think I know why it works this way, though I haven't found the example file's data in the log. If you comment out this block:
---
if (ctxt->node)
{
    /* during domdoc_loadXML() the xmlDocPtr->_private data is not available */
    if (!This->properties->preserving &&
        !is_preserving_whitespace(ctxt->node) &&
        strn_isspace(ch, len))
        return;
}
---
from sax_characters() in msxml3/domdoc.c, you will probably get proper results (but be aware that it could stop working entirely, because the tree will be different after this change).
The problem is in the way libxml2 reports character references: they are reported in a separate SAX callback, and when we do this whitespace-fixing mess we're unaware of the history of previous calls. It will work if we make sure the next callback is not for text data. In practice this means we should delay text data processing until we get some other callback, so that we can process the text node as a whole and drop it if it's nothing but space characters (depending on that properties condition, of course).
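The delayed-processing idea described above can be sketched in isolation. This is only a minimal illustration under stated assumptions, not the patch that was eventually committed: the struct text_accum type and the on_characters() and flush_text() helpers are hypothetical names invented here, the buffers are fixed-size for brevity, and no libxml2 or Wine API is used. Text chunks are only buffered as they arrive; the whitespace check runs once, on the whole accumulated node, when a non-text callback arrives.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for parser state; not Wine's actual structures. */
struct text_accum {
    char   buf[256];  /* pending text from consecutive characters() calls */
    size_t len;       /* number of pending bytes in buf */
    int    preserve;  /* non-zero: keep whitespace-only text nodes */
    char   out[512];  /* emitted text nodes, each terminated by '|' */
    size_t out_len;
};

/* Returns 1 if every byte in s[0..len) is whitespace, 0 otherwise. */
static int all_space(const char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (!isspace((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Called for every text chunk the parser reports. It only buffers;
 * no whitespace decision is made on an individual chunk. */
static void on_characters(struct text_accum *a, const char *ch, size_t len)
{
    assert(a->len + len <= sizeof(a->buf));
    memcpy(a->buf + a->len, ch, len);
    a->len += len;
}

/* Called from any non-text callback (start/end element, comment, ...).
 * The accumulated node is judged as a whole: it is dropped only if it
 * is entirely whitespace and whitespace is not being preserved. */
static void flush_text(struct text_accum *a)
{
    if (a->len && (a->preserve || !all_space(a->buf, a->len))) {
        assert(a->out_len + a->len + 2 <= sizeof(a->out));
        memcpy(a->out + a->out_len, a->buf, a->len);
        a->out_len += a->len;
        a->out[a->out_len++] = '|';
        a->out[a->out_len] = '\0';
    }
    a->len = 0;
}
```

With a zero-initialized struct text_accum, feeding the chunks "A", " ", "B" separately (as a parser might when the letters come from character references) and then calling flush_text() keeps the space, because the node as a whole is not blank. A chunk sequence that is blank in its entirety is still dropped unless preserve is set.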
I'll keep looking at it, just no obviously correct and quick solution comes to mind right now.
And confirming.
Alexander Varnin fenixk19@mail.ru changed:
What  |Removed  |Added
----------------------------------------------------------------------------
CC    |         |fenixk19@mail.ru
--- Comment #7 from Alexander Varnin fenixk19@mail.ru 2012-10-19 19:43:44 CDT ---
Confirming still there on 1.5.15
Maia Kozheva sikon@ubuntu.com changed:
What  |Removed  |Added
----------------------------------------------------------------------------
CC    |         |sikon@ubuntu.com
Igor Demyanov igor.demyanov@gmail.com changed:
What  |Removed  |Added
----------------------------------------------------------------------------
CC    |         |igor.demyanov@gmail.com
--- Comment #8 from Igor Demyanov igor.demyanov@gmail.com 2012-12-26 11:13:47 CST ---
Confirming still there on 1.5.19
--- Comment #9 from Nikolay Sivov bunglehead@gmail.com 2013-05-05 06:47:01 CDT ---
Created attachment 44377
  --> http://bugs.winehq.org/attachment.cgi?id=44377
patch
Please try this patch on top of current Wine from git; I think it will help.
Nikolay Sivov bunglehead@gmail.com changed:
What     |Removed                     |Added
----------------------------------------------------------------------------
Summary  |msxml3 incorrectly parses   |msxml3 incorrectly ignores
         |Cyrillic text with spaces   |whitespaces (needed for
         |(needed for Civilization    |Civilization IV)
         |IV)                         |
--- Comment #10 from Nikolay Sivov bunglehead@gmail.com 2013-05-07 03:04:02 CDT ---
This should be fixed in git with 0403f34b78ef1f663f0703784e089e8396fb338c, please retest.
Nikolay Sivov bunglehead@gmail.com changed:
What           |Removed  |Added
----------------------------------------------------------------------------
Fixed by SHA1  |         |0403f34b78ef1f663f0703784e089e8396fb338c
Status         |NEW      |RESOLVED
Resolution     |         |FIXED
--- Comment #11 from Nikolay Sivov bunglehead@gmail.com 2013-05-23 20:50:27 CDT ---
No reply in a couple of weeks, marking fixed. Reopen if it's still broken the same way.
Alexandre Julliard julliard@winehq.org changed:
What    |Removed   |Added
----------------------------------------------------------------------------
Status  |RESOLVED  |CLOSED
--- Comment #12 from Alexandre Julliard julliard@winehq.org 2013-05-24 13:32:59 CDT ---
Closing bugs fixed in 1.5.31.