http://bugs.winehq.org/show_bug.cgi?id=16514
Summary: broken encoding or database conversion regarding Umlaut
Product: WineHQ Apps Database
Version: unspecified
Platform: PC
OS/Version: Linux
Status: UNCONFIRMED
Severity: enhancement
Priority: P2
Component: appdb-unknown
AssignedTo: wine-bugs@winehq.org
ReportedBy: hoehle@users.sourceforge.net
Since the recent restyling of winehq.org, Umlauts such as "äüöÄÖÜ" are not displayed correctly anymore in AppDB. See e.g. "Die Völker 2/The Nations" http://appdb.winehq.org/objectManager.php?sClass=version&iId=11470
Instead, a "?" is displayed (typical for unrecognized UTF-8 characters in some parsers).
This affects:
- my login name, in the initial greetings screen
- Category: Main > Games > Strategy Games > Die V�lker 2 > 2.02
- Name Die V�lker 2
OTOH, in bugzilla, comments of mine are correctly displayed with Umlauts.
I'm using firefox 2.0.x or 3.x.y
Is it just a matter of web server configuration, or was the application database incorrectly converted or backed-up at some point in time?
Regards, Jörg Höhle
Rosanne DiMesio dimesio@earthlink.net changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |dimesio@earthlink.net
--- Comment #1 from Rosanne DiMesio dimesio@earthlink.net 2008-12-15 19:04:25 --- It's not just umlauts, it's a variety of special characters, including quotation marks (which I suspect were probably "curly" quotes copied from an MS Word document), dashes, and trademark symbols. Manually correcting them works, and I've already fixed a few. Unless someone tells me there's an easier way to fix it, I plan on doing it app by app.
--- Comment #2 from André H. nerv@dawncrow.de 2008-12-21 06:30:48 --- Created an attachment (id=18105) --> (http://bugs.winehq.org/attachment.cgi?id=18105) Username Codepage Problem
But what is with Usernames? Like mine? It should be André H. and not like in the attachment
--- Comment #3 from Rosanne DiMesio dimesio@earthlink.net 2008-12-21 09:46:24 --- (In reply to comment #2)
> Created an attachment (id=18105)
> --> (http://bugs.winehq.org/attachment.cgi?id=18105) [details]
> Username Codepage Problem
> But what is with Usernames? Like mine? It should be André H. and not like in the attachment
The problem affects any characters in any part of the AppDB that are not UTF-8. You should be able to edit your profile to fix the problem with your name. Users are going to have to do that anyway--I'm not going to try to guess the correct spelling of people's names.
At the moment, I'm about halfway through editing the application names and descriptions.
This is going to take some time, and I'm not going to be able to fix everything. I can't edit comments at all, even as an admin.
Thomas Beimel gnulinux@thomas-beimel.de changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |gnulinux@thomas-beimel.de
--- Comment #4 from Thomas Beimel gnulinux@thomas-beimel.de 2008-12-23 05:25:03 --- I noticed the problem too and changed the name of an application that I maintain. If help is needed with further renaming, I can volunteer.
nathan.n saturn_systems@yahoo.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |saturn_systems@yahoo.com
--- Comment #5 from Jeff Zaroyko jeffz@jeffz.name 2008-12-31 17:14:18 --- It's probably because the content type of all pages being served by the AppDB have been changed to utf-8. Someone will need to write a patch to change it back to ISO-8859-1.
--- Comment #6 from Rosanne DiMesio dimesio@earthlink.net 2008-12-31 20:22:25 --- (In reply to comment #5)
> It's probably because the content type of all pages being served by the AppDB have been changed to utf-8. Someone will need to write a patch to change it back to ISO-8859-1.
Are you sure the change wasn't intentional?
I asked about this on wine-devel before I started fixing things manually, but never got a clear response, or any real interest in the problem.
--- Comment #7 from Jeff Zaroyko jeffz@jeffz.name 2008-12-31 23:14:52 --- (In reply to comment #6)
> (In reply to comment #5)
> > It's probably because the content type of all pages being served by the AppDB have been changed to utf-8. Someone will need to write a patch to change it back to ISO-8859-1.
> Are you sure the change wasn't intentional?
> I asked about this on wine-devel before I started fixing things manually, but never got a clear response, or any real interest in the problem.
No, I don't see how it could be intentional. Probably just copy/pasting from the main site design.
--- Comment #8 from Austin English austinenglish@gmail.com 2009-01-01 17:35:38 --- (In reply to comment #5)
> It's probably because the content type of all pages being served by the AppDB have been changed to utf-8. Someone will need to write a patch to change it back to ISO-8859-1.
You'll break anything already changed to UTF-8.
It may be easier to update the database itself to UTF-8...
--- Comment #9 from nathan.n saturn_systems@yahoo.com 2009-01-01 17:40:10 --- (In reply to comment #8)
> You'll break anything already changed to UTF-8.
> It may be easier to update the database itself to UTF-8...
Yes, I agree; please leave the database in UTF-8.
--- Comment #10 from Rosanne DiMesio dimesio@earthlink.net 2009-01-01 18:30:35 --- IMO, it's better to keep the AppDB consistent with the rest of the website, if only to avoid this problem recurring the next time someone does a redesign.
--- Comment #11 from Jörg Höhle hoehle@users.sourceforge.net 2009-01-30 05:45:56 --- How to vote for this bug? I feel this issue should have high priority because there's a risk that the DB is filled with mixed UTF-8 and ISO8859-1 encodings. Should that happen it will be very hard to recover it to a sane state. OTOH, perhaps somebody with low-level access can jump in and say that the DB is fine and totally consistent, and just the web interface suffers from broken encodings? Then the bug is not that severe.
--- Comment #12 from Jörg Höhle hoehle@users.sourceforge.net 2009-01-30 05:50:56 --- This is a note that I felt was not worth its own bug number: the broken umlaut conversion also affects the e-mails that we receive when submitting
- application entries
- test versions
- screenshots
E.g. "The screenshot you submitted for Max & Mario (4) Dr. DA1/4sters Schatten 1.0 German / Deutsch has been accepted." It should have said "Dr. Düsters Schatten" (and appears correctly in the appdb web page).
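The "DA1/4sters" in that e-mail is consistent with UTF-8 bytes being read as ISO-8859-1 somewhere on the mail path: "ü" is two bytes in UTF-8 (0xC3 0xBC), which read as "Ã" and "¼" in ISO-8859-1. A minimal Python sketch of that effect follows. The final flattening to plain ASCII ("A" and "1/4") is only an assumption about what some mail component might do; the thread does not confirm it, and `to_ascii` is a hypothetical helper.

```python
import unicodedata

title = "Dr. Düsters Schatten"

# UTF-8 bytes misread as ISO-8859-1: "ü" (0xC3 0xBC) turns into "Ã¼".
misread = title.encode("utf-8").decode("iso-8859-1")
print(misread)


def to_ascii(text: str) -> str:
    """Hypothetical ASCII-only transliteration step (an assumption, see above)."""
    # NFKD splits "Ã" into "A" + combining tilde and "¼" into "1", U+2044, "4".
    decomposed = unicodedata.normalize("NFKD", text).replace("\u2044", "/")
    # Dropping everything outside ASCII removes the combining tilde.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii(misread))
```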
--- Comment #13 from Rosanne DiMesio dimesio@earthlink.net 2009-01-30 12:44:04 --- (In reply to comment #11)
> How to vote for this bug? I feel this issue should have high priority because there's a risk that the DB is filled with mixed UTF-8 and ISO8859-1 encodings.
As I understand it, that's precisely the problem. Anything entered in the AppDB since the switch is UTF-8. Older entries are littered with ISO-8859-1 characters. Switching the website codepage back to ISO-8859-1 would fix the display of ISO-8859-1 characters, but cause the UTF-8 characters to display incorrectly. Either way, entries are going to have to be fixed manually. Since the rest of the website is UTF-8, it makes more sense to me to change the ISO-8859-1 characters to UTF-8 than the other way around. Feel free to correct me if I'm mistaken.
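The two failure modes described above can be reproduced with a few lines of Python. This is only a sketch under assumptions: `render` and `to_utf8` are hypothetical helpers, not AppDB code, and the "try UTF-8 first, otherwise treat as ISO-8859-1" repair is a common heuristic rather than what was actually done to the database. Curly quotes pasted from MS Word are more likely Windows-1252 than ISO-8859-1, so a real cleanup might need that codec instead.

```python
def render(raw: bytes, page_charset: str) -> str:
    """Decode stored bytes the way a browser would for the declared charset."""
    return raw.decode(page_charset, errors="replace")


old_row = "Die Völker 2".encode("iso-8859-1")  # entered before the switch
new_row = "Die Völker 2".encode("utf-8")       # entered after the switch

# Page served as UTF-8: the old row shows the replacement character.
print(render(old_row, "utf-8"))
# Page served as ISO-8859-1: the new row shows two junk characters instead.
print(render(new_row, "iso-8859-1"))


def to_utf8(raw: bytes) -> bytes:
    """Heuristic repair: keep valid UTF-8, re-encode everything else."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was ISO-8859-1.
        return raw.decode("iso-8859-1").encode("utf-8")


print(to_utf8(old_row) == to_utf8(new_row))
```

The heuristic works because ISO-8859-1 text with accented letters is almost never valid UTF-8 by accident, but it cannot recover bytes that were already replaced by "?" before they were stored.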
--- Comment #14 from Jörg Höhle hoehle@users.sourceforge.net 2009-03-30 06:31:45 --- Here's a perhaps different symptom. AppDB mis-displays an entry which did not exist prior to the ISO-8859-1 -> UTF-8 damage. So it should be UTF-8 all along. Nevertheless, "?" is displayed instead of an Umlaut.
http://appdb.winehq.org/objectManager.php?sClass=version&iId=12468 says: known bugs: "The Alien Nations / Die V�lker crashes: ..." What is particular here may be that this text comes from Bugzilla, not AppDB.
Ken Sharp kennybobs@o2.co.uk changed:
What           |Removed                     |Added
----------------------------------------------------------------------------
Status         |UNCONFIRMED                 |NEW
Ever Confirmed |0                           |1
Severity       |enhancement                 |normal
--- Comment #15 from Ken Sharp kennybobs@o2.co.uk 2009-05-27 18:56:16 --- Confirming.
Alex Balut alexandru.balut@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |alexandru.balut@gmail.com
--- Comment #16 from Alex Balut alexandru.balut@gmail.com 2010-04-17 16:30:26 --- Regarding the displaying of bugs..
The AppDB page for "uTorrent 2.0.x" contains a link to bug 21156.
The Bugzilla page, where the character shows as a "micro" character (good):
http://bugs.winehq.org/show_bug.cgi?id=21156
Content-Type: text/html; charset=UTF-8
The AppDB page, where the character shows up as an "unknown" character (bad):
http://appdb.winehq.org/objectManager.php?sClass=version&iId=17511
Content-Type: text/html
Would it be a problem if the Content-Type of the AppDB page is changed to be the same as the one for the Bugzilla page?
--- Comment #17 from Jörg Höhle hoehle@users.sourceforge.net 2010-04-23 07:06:31 --- What needs to be done is, IMHO:
1. See what encoding the database(s) use. There's no point mucking with encodings on the web side of things if the DB contains junk (that is, a mixture of ISO-8859-1 and UTF-8). Who can tell what it is?
2. See & control what encodings are used between applets and DB (connection level).
3. Control what the web applets spit out to the browsers.
Nowadays the answer is obviously UTF-8. It's equally obvious that the Wine databases did not start that way.
Last but not least, it's not the web server's choice to impose an encoding on the browser. It ought to listen to the client's advertised capabilities (Accept-*) and *should* convert on the fly if e.g. the browser indicates that it only accepts ISO-8859-1. Few servers actually do this.
4. Tell the SMTP component about the correct encoding so it can send out proper MIME formatted e-mails (quoted printable when needed etc.)
Or, to respond to your question directly:
> Would it be a problem if the Content-Type of the AppDB page is changed to be the same as the one for the Bugzilla page?
First you need to make sure that the AppDB applets actually receive the "micro" character from the DB. Only afterwards can you convert its encoding to what the rest of the AppDB page uses.
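For step 4 in the list above (proper MIME e-mails, quoted-printable when needed), here is a minimal sketch using Python's standard `email` package. It is an illustration only: AppDB is a PHP application, and the addresses and message text below are hypothetical.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "appdb@example.org"        # hypothetical sender
msg["To"] = "submitter@example.org"      # hypothetical recipient
msg["Subject"] = "Screenshot accepted: Dr. Düsters Schatten"

# Declare the charset and let the library apply quoted-printable encoding.
msg.set_content(
    "Your screenshot for Dr. Düsters Schatten has been accepted.\n",
    charset="utf-8",
    cte="quoted-printable",
)

wire = msg.as_string()   # what goes over SMTP: pure ASCII
print(wire)

# The receiving side gets the original text back from the declared charset.
print(msg.get_content())
```

The point is that the message states its own encoding (`charset="utf-8"` in Content-Type plus a Content-Transfer-Encoding header), so the mail client does not have to guess.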
--- Comment #18 from Alex Balut alexandru.balut@gmail.com 2010-04-24 04:55:42 --- I was able to reproduce the bug on my machine:
- Installed AppDB and Bugzilla.
- Tried äüöÄÖܵ in: user names, application name, version name, app description, version description, developer. They seem to appear fine.
- Added a bug in Bugzilla containing the same characters in the bug title. It seems to appear fine in Bugzilla.
- Linked it with a version of an app. Then I noticed (on the page of that app version) that these seven characters in the title of the bug appear as "�" (the unknown-character glyph).
I included this in the header of objectManager.php, but nothing changes: Content-type: text/html; charset=utf-8
I'll check the encoding of the data from the bugs table in the bugs database..
--- Comment #19 from Alex Balut alexandru.balut@gmail.com 2010-04-24 04:59:01 --- Created an attachment (id=27533) --> (http://bugs.winehq.org/attachment.cgi?id=27533) mysqldump bugs --tables bugs > bugs.sql
--- Comment #20 from Alex Balut alexandru.balut@gmail.com 2010-04-25 08:50:00 --- Submitted patch for fixing the displaying of bug summaries. http://www.winehq.org/pipermail/wine-patches/2010-April/087761.html
$ mysql bugs -e "show variables like 'character_set_%'"
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
I changed the character_set_results MySQL variable (http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_c...) to utf8, and the short_description of the bug appears fine when displaying an app version linked to that bug.
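Why the `character_set_results` change helps can be simulated without a database. The Python sketch below is only a model under assumptions: `fetch` and `page` are hypothetical stand-ins for the MySQL connection and the AppDB page, the bug summary is made up, and MySQL's `latin1` is treated as plain ISO-8859-1 for simplicity.

```python
def fetch(stored_text: str, character_set_results: str) -> bytes:
    """Model of MySQL converting a result column to character_set_results."""
    return stored_text.encode(character_set_results, errors="replace")


def page(body: bytes, declared_charset: str = "utf8") -> str:
    """Model of a browser decoding the page with the declared charset."""
    return body.decode(declared_charset, errors="replace")


summary = "µTorrent test bug"  # hypothetical bug summary, stored as utf8

# Results converted to latin1 but the page declared as UTF-8:
# the single byte 0xB5 is not valid UTF-8, so "µ" becomes U+FFFD.
print(page(fetch(summary, "latin1")))

# Results delivered as utf8: the summary survives unchanged.
print(page(fetch(summary, "utf8")))
```

This also matches the observation in comment #18 that adding a charset to the Content-type header alone changed nothing: the bytes had already been converted on the connection.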
I looked for users and apps with names that contain characters which do not appear fine, but could not find any. Can anyone provide an example?
Danila Sentiabov dsent.zen@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |dsent.zen@gmail.com
Danila Sentiabov dsent.zen@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |dsent.zen@gmail.com         |
André H. nerv@dawncrow.de changed:
What          |Removed                     |Added
----------------------------------------------------------------------------
Fixed by SHA1 |                            |7a70a03b567daf44d7b16fa953d4659f7fc57a25
Status        |NEW                         |RESOLVED
CC            |                            |nerv@dawncrow.de
Resolution    |                            |FIXED
--- Comment #21 from André H. nerv@dawncrow.de 2012-06-19 14:14:40 CDT --- fixed
André H. nerv@dawncrow.de changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Status  |RESOLVED                    |CLOSED
--- Comment #22 from André H. nerv@dawncrow.de 2012-06-19 14:14:53 CDT --- closing