I'm setting up a new wiki installation and running into some problems with garbage characters showing up due to mismatched character sets. The wiki in question is here: http://wikiausland.de/bookshop/Hauptseite
New articles are fine and display in UTF-8 as expected, but the owner has copied over some content, presumably from an old wiki or MS Word, and it seems to be in ISO-8859-1, so all the umlauts etc. show up as question marks… does anyone know how I can convert a page from ISO-8859-1 to UTF-8 easily enough?
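For what it's worth, the symptom matches Latin-1 bytes being run through a UTF-8 decoder: any byte above 0x7F is not a valid UTF-8 sequence on its own, so a lenient decoder replaces it with U+FFFD, which many fonts render as a question mark. A quick Python sketch (the sample string is just an illustration):

```python
# "Grüße" in ISO-8859-1 (Latin-1) is one byte per character:
latin1_bytes = "Grüße".encode("latin-1")  # b'Gr\xfc\xdfe'

# Decoded as UTF-8, the 0xFC and 0xDF bytes are invalid sequences,
# so a lenient decoder substitutes the replacement character U+FFFD
# (typically displayed as "?" or "�").
print(latin1_bytes.decode("utf-8", errors="replace"))  # Gr��e
```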
I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it might do the conversion for me on article save, but no joy. Are there any other options?
Any tips would be greatly appreciated!
Andru
On Mon, Nov 11, 2013 at 4:17 PM, Andru Vallance andru@tinymighty.com wrote:
> I'm setting up a new wiki installation and running into some problems with garbage characters showing up due to mismatched character sets. The wiki in question is here: http://wikiausland.de/bookshop/Hauptseite
> New articles are fine and display in UTF-8 as expected, but the owner has copied over some content, presumably from an old wiki or MS Word, and it seems to be in ISO-8859-1, so all the umlauts etc. show up as question marks… does anyone know how I can convert a page from ISO-8859-1 to UTF-8 easily enough?
> I've tried setting $wgLegacyEncoding to 'ISO-8859-1' [1] in the hope it might do the conversion for me on article save, but no joy. Are there any other options?
I guess he copied it into a wiki that was already UTF-8, so the row was flagged as UTF-8 when it was saved.
$wgLegacyEncoding does nothing if the row is already flagged as utf-8 (MediaWiki records this in the old_flags column of the text table). You could fix this with a bot, or possibly by changing the flag in the DB (I don't know how safe that is...).
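If the raw bytes in the affected rows really are Latin-1, the byte-level fix is simply to re-decode them as Latin-1 and re-encode as UTF-8. A minimal Python sketch of that transform:

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Reinterpret bytes that are really ISO-8859-1 and re-encode as UTF-8.

    Latin-1 maps every byte 0x00-0xFF to the code point of the same value,
    so the decode step can never fail; the encode step then produces the
    proper multi-byte UTF-8 sequences.
    """
    return raw.decode("latin-1").encode("utf-8")

# "ü" is 0xFC in Latin-1 but the two bytes 0xC3 0xBC in UTF-8.
print(latin1_to_utf8(b"Gr\xfc\xdfe"))  # b'Gr\xc3\xbc\xc3\x9fe'
```

Note this only helps if the bad bytes actually made it into the database; if they were already squashed to literal '?' characters on save, the original characters are gone and the pages will have to be re-pasted.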
But the very first thing you need is a list of pages that need fixing. Maybe that's just as simple as listing that particular user's contribs.
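Listing that user's contributions can be scripted against the API (list=usercontribs). A rough Python sketch using only the standard library; the wiki's api.php URL and the username are placeholders you'd substitute:

```python
import json
import urllib.parse
import urllib.request

def contribs_url(api_base: str, user: str, limit: int = 50) -> str:
    """Build an API query URL for one batch of a user's contributions."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "uclimit": limit,
        "ucprop": "title|timestamp",
        "format": "json",
    }
    return api_base + "?" + urllib.parse.urlencode(params)

def fetch_contrib_titles(api_base: str, user: str) -> list:
    """Fetch the page titles the user has edited (one batch, no continuation)."""
    with urllib.request.urlopen(contribs_url(api_base, user)) as resp:
        data = json.load(resp)
    return [c["title"] for c in data["query"]["usercontribs"]]

# e.g. fetch_contrib_titles("http://wikiausland.de/bookshop/api.php", "SomeUser")
```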
-Jeremy
mediawiki-l@lists.wikimedia.org