Hello:
When searching for the word cliché in my wiki, the search results return pages that have the word in it but it looks like cliché instead of cliché. This is happening with other words and phrases that use characters like: single quotation mark, double quotation mark, apostrophe, en dash, em dash, accent acute, accent grave, tilde, umlaut, etc.
What is the best way to fix this?
I'm running:
MediaWiki: 1.10.2 (r356) PHP: 5.2.17 (cgi) MySQL: 5.0.90-log
Thanks,
Patricia Barden
On 7/3/2012 6:51 PM, Patricia Barden wrote:
> MediaWiki: 1.10.2 (r356)
Is there a reason why you're using such an old MediaWiki? I don't know for sure, but I imagine this particular problem may be fixed by an upgrade.
> PHP: 5.2.17 (cgi) MySQL: 5.0.90-log
It looks like you can run 1.19. I've been doing some upgrades from MW 1.11 for a client and they work pretty well. I encourage you to upgrade.
Mark.
Well, I still use MediaWiki 1.17.3, and it doesn't have many problems. It has no problem displaying special characters.
The problem may be with your browser. It is probably not Unicode compliant. Without Unicode support, special characters turn into boxes, question marks, or other stray symbols.
Otherwise, you should upgrade to 1.19. It's the latest version of MediaWiki, released in May 2012.
-- What is normal? Normal is yesterday and last week and last month taken together. -- Snuff, Terry Pratchett
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
On 7/4/2012 2:45 PM, Brandon Pimenta wrote:
> Well I still use MediaWiki 1.17.3 and it turns out it doesn't have many problems. It has no problem displaying special characters.
1.17 (June 2011) is a lot more recent than 1.10 (May 2007). While your version is only a year behind 1.19, the version the original poster is using is 5 years behind. (I would encourage you, too, to upgrade to 1.19, since it is slated for Ubuntu-like "long term support.")
A *lot* of work has been done on "special character" support. Looking through the release notes of MW versions after 1.10 for (possibly) relevant fixes that are in 1.17, I came up with the following:
* (bug 16697) Unicode combining characters are difficult to edit in some browsers.
* (bug 8445) Multiple-character search terms are now handled properly for Chinese.
* (bug 15248) Non-breaking spaces and certain other Unicode space characters are now normalized to ordinary spaces in titles; if your wiki has existing titles with such characters, run cleanupTitles.php and/or cleanupImages.php
* (bug 8143) Localised parser function names are now correctly case insensitive if they contain non-ASCII characters
* Truncate summary of page moves in revision comment field to avoid broken multibyte characters
* Updated Unicode normalization tables
* (bug 14952) Page titles are renormalized after html entities are removed so that links with non-NFC character references work correctly.
* (bug 3097) Inconsistently usable titles containing HTML character entities are now forbidden. A run of cleanupTitles.php will fix up existing pages.
HTH,
Mark.
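Two of the release-note items above (bugs 15248 and 14952) concern Unicode normalization of titles. As a rough illustration of the idea only, and not MediaWiki's actual code, here is a minimal Python sketch. The helper name normalize_title is hypothetical, and treating every character in Unicode category "Zs" as a space is an assumption; the release note only says "certain other Unicode space characters".

```python
import unicodedata


def normalize_title(title: str) -> str:
    """Hypothetical helper: NFC-normalize a title, then replace Unicode
    space separators (category Zs, an assumption) with ordinary spaces."""
    nfc = unicodedata.normalize("NFC", title)
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in nfc)


decomposed = "cliche\u0301"  # 'e' followed by a combining acute accent
composed = "clich\u00e9"     # precomposed e-acute

# The two strings render the same but compare unequal until normalized.
assert decomposed != composed
assert normalize_title(decomposed) == composed

# A non-breaking space (U+00A0) becomes an ordinary space.
assert normalize_title("Main\u00a0Page") == "Main Page"
```

Without a step like this, a link or search term typed in one form would not match a title stored in the other, even though both look identical on screen.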
On Jul 3, 2012, at 8:21 PM, Mark A. Hershberger wrote:
Hi Mark:
Thanks for your suggestion to upgrade. I'm now running:
MediaWiki 1.19.1 (r390)
PHP 5.3.15 (apache2handler)
MySQL 5.1.56-log
The upgrade, unfortunately, did not fix the broken characters. There weren't a ton, so I fixed them manually.
Pat
On Tue 04 Sep 2012 12:37:42 PM EDT, Patricia Barden wrote:
> Thanks for your suggestion to upgrade.
You're welcome!
> I'm now running: MediaWiki 1.19.1 (r390) PHP 5.3.15 (apache2handler) MySQL 5.1.56-log
Excellent. In case you weren't aware, 1.19.2 was just released.
> The upgrade, unfortunately did not fix the broken characters. There weren't a ton so I fixed them manually.
I noticed a lot of wrongly encoded characters in page titles on my client's system as well, and they weren't "fixed" by the upgrade either.
The client had already created new pages with titles in the correct encoding, so I'll just go back and redirect the "bad" page titles.
Human evil is not a problem. It is a mystery. It cannot be solved. -- When Atheism Becomes a Religion, Chris Hedges
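For anyone with too many broken characters to fix by hand, the manual repair described above can sometimes be scripted. The following is a minimal Python sketch under a stated assumption: the text was originally UTF-8 and was misread as Windows-1252 exactly once. Nothing in this thread confirms that this is what happened on either wiki. The function name repair_mojibake is hypothetical, and any output should be reviewed before it is written back to a wiki.

```python
def repair_mojibake(text: str) -> str:
    """Hypothetical helper: undo one round of 'UTF-8 bytes read as cp1252'.

    If the text does not fit that pattern, it is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Covers both encode and decode failures: the text was not
        # garbled in the assumed way (or is already correct).
        return text


garbled_word = "clich\u00c3\u00a9"        # displays as 'clich' + A-tilde + copyright sign
garbled_quote = "it\u00e2\u20ac\u2122s"   # a garbled right single quotation mark

assert repair_mojibake(garbled_word) == "clich\u00e9"    # e-acute restored
assert repair_mojibake(garbled_quote) == "it\u2019s"     # curly apostrophe restored
assert repair_mojibake("clich\u00e9") == "clich\u00e9"   # correct text is left alone
```

The try/except is the important design choice here: text that is already correct usually fails the round trip and is returned as-is, so the helper is fairly safe to run over mixed content, though not guaranteed to be.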
On 04/07/12 00:51, Patricia Barden wrote:
> Hello:
> When searching for the word cliché in my wiki, the search results return pages that have the word in it but it looks like cliché instead of cliché. This is happening with other words and phrases that use characters like: single quotation mark, double quotation mark, apostrophe, en dash, em dash, accent acute, accent grave, tilde, umlaut, etc.
> What is the best way to fix this?
So you are viewing Special:Search as if it weren't in UTF-8. If you go to the View menu of your browser, what charset is shown there? Can you change it to UTF-8?
Another possibility is that your tables have mixed collations. But that would be strange: I'd expect the problem to appear on all pages, and the searchindex table has an encoding step.
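To make the charset mismatch described above concrete, here is a small Python sketch of what happens when UTF-8 bytes are interpreted under a different charset. The helper name view_as is hypothetical, and Windows-1252 is only an assumed example of a wrong charset; the thread does not establish which charset was actually involved.

```python
def view_as(text: str, assumed_charset: str) -> str:
    """Hypothetical helper: encode text as UTF-8, then decode the same
    bytes as if they were in assumed_charset."""
    return text.encode("utf-8").decode(assumed_charset)


word = "clich\u00e9"  # the word with a precomposed e-acute

# Read back as UTF-8, the word is unchanged.
assert view_as(word, "utf-8") == word

# Read back as Windows-1252, the two bytes of the e-acute (0xC3 0xA9)
# turn into two separate characters: A-tilde and the copyright sign.
assert view_as(word, "cp1252") == "clich\u00c3\u00a9"

# The garbled version is one character longer than the original.
assert len(view_as(word, "cp1252")) == len(word) + 1
```

The same effect applies to curly quotes and dashes, which take three bytes each in UTF-8 and so show up as three stray characters.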