Staying so long with ISO 8859 was a mistake.
So I propose converting every Wikipedia that isn't using UTF-8 yet to UTF-8. The procedure should go like this:
1. Prepare a new LanguageXX.php and put it under some other name.
2. Make backups.
3. Create tables curutf8 and oldutf8.
4. Disable write access.
5. Convert all data. Numeric HTML codes are going to be replaced by UTF-8 characters too.
6. Rename tables cur and old to cur88591 and old88591.
7. Rename tables curutf8 and oldutf8 to cur and old.
8. Replace the old LanguageXX.php with the UTF-8-enabled version.
9. Reenable write access.
The conversion script should be tested on the test.* Wikipedia first.
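To make step 5 concrete, here is a rough sketch of what the per-text conversion could look like. It is only an illustration, not the actual conversion script: the function name `convert_text`, the default source encoding, and the choice to leave numeric codes below 160 untouched (so that escaped markup characters such as `&#60;` stay escaped) are all assumptions of mine.

```python
import re

# Matches decimal (&#321;) and hexadecimal (&#x141;) numeric HTML codes.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def _replace_ref(match):
    hex_digits, dec_digits = match.groups()
    code = int(hex_digits, 16) if hex_digits else int(dec_digits)
    # Assumption: keep ASCII and C1-range codes as they are, so escaped
    # markup such as &#60; (<) or &#38; (&) does not change meaning.
    # Also skip values that are not valid Unicode scalar values.
    if code < 0xA0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return match.group(0)
    return chr(code)


def convert_text(raw, source_encoding="iso-8859-1"):
    """Decode legacy bytes, expand numeric HTML codes, return UTF-8 bytes."""
    text = raw.decode(source_encoding)
    text = NUMERIC_REF.sub(_replace_ref, text)
    return text.encode("utf-8")


if __name__ == "__main__":
    legacy = b"Caf\xe9 &#321;&#x00F3;d&#378;"
    print(convert_text(legacy).decode("utf-8"))  # prints: Café Łódź
```

A real script would run something like this over every text and title field of the cur and old tables, and would pick the source encoding per Wikipedia (8859-1, 8859-2, and so on).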
During step 5 the Wikipedia is going to be read-only. That may take some time, especially with the English Wikipedia, so it's better to convert each Wikipedia separately. During steps 6-8 the Wikipedia may not work at all, but that is going to take less than a minute.
Does anybody have a really good reason why I shouldn't proceed? These reasons aren't good enough:
* broken URLs - all old URLs are going to work after the upgrade
* size increase - size is going to stay about the same
* broken browsers - they should be upgraded. If someone has a browser so old that it doesn't grok UTF-8, it's not going to grok CSS, PNGs, and the other things we're using either. Unless we want to remove all CSS and PNGs, there's no point in not using UTF-8.
* ISO 8859-N is good enough - no, it's not. Not if someone wants to write about people and places from countries where non-8859-1 Latin characters are used, or about linguistics, or math, etc.
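On the size point, the two effects pull in opposite directions: accented 8859-1 characters grow from one byte to two, while every character currently stored as a numeric HTML code shrinks. A small sketch to show this (the helper name `encoded_sizes` is made up for the illustration, and it says nothing about the real mix of characters in any particular Wikipedia):

```python
def encoded_sizes(text):
    """Return (bytes as ISO 8859-1 with numeric HTML codes, bytes as UTF-8)."""
    # Characters outside ISO 8859-1 are written as decimal codes like &#321;
    legacy = text.encode("iso-8859-1", errors="xmlcharrefreplace")
    utf8 = text.encode("utf-8")
    return len(legacy), len(utf8)


if __name__ == "__main__":
    print(encoded_sizes("Café"))  # prints: (4, 5)
    print(encoded_sizes("Łódź"))  # prints: (14, 7)
```

How the total comes out depends on how much of the text is plain ASCII, which is the same size either way.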