Hi!
On 23/06/06, Berto albertoserra@ukr.net wrote:
Perhaps something like a system message which defines a wiki's search order and which letters the wiki will treat as equivalent? And something so that searches for, say, "pago" could turn up results like "págo" and "pàgo" (since the accents would be ignored)?
Yes, this would be the most practical solution to localise sorting orders.
I think you are mixing up sorting and searching. Anyway, I don't think this is a good solution for sorting. IMHO it would be 1) a hack that would not be enough for many languages (e.g. in Czech, alphabetical sorting is a two-stage process, as you can see at http://cs.wikipedia.org/wiki/Abecedn%C3%AD_%C5%99azen%C3%AD#P.C5.99.C3.ADkla...) and 2) maybe hard to implement without a serious performance cost.
A correct solution is to use the functions already provided by the system (and the database); MySQL already has some collation support that could be used, see the discussion at [[bugzilla:164]].
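To illustrate only the two-stage idea mentioned above, here is a minimal Python sketch. The function names are made up for this example, and it deliberately does not implement the real Czech rules (which are more involved) or any MySQL collation. The first stage compares words with diacritics ignored, and the second stage uses the accented form only to break ties.

```python
import unicodedata

def primary_key(word: str) -> str:
    """Stage 1 key: the word with combining marks removed and case folded."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()

def two_stage_key(word: str) -> tuple[str, str]:
    """Sort key: stage 1 ignores diacritics, stage 2 breaks ties.

    Assumption: ties are broken by plain code point order of the
    NFC-normalized word, which is a simplification, not a real collation.
    """
    return (primary_key(word), unicodedata.normalize("NFC", word))

words = ["pàgo", "pago", "pagz", "págo", "paga"]
naive = sorted(words)                        # accented forms end up after "pagz"
two_stage = sorted(words, key=two_stage_key) # accented forms stay next to "pago"
print(naive)
print(two_stage)
```

A plain code point sort pushes the accented forms behind "pagz", while the two-stage key keeps all variants of "pago" together. A real collation would take its rules from the system or database, as said above.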
As per accents, so far all we can do is manually input tons of redirects to ensure the search will give a proper result... maybe a bot could do it for us? Such redirects should somehow be marked, though, so that we can trace them and kill them all once the bug is fixed.
I have been thinking for a long time about implementing a diacritics filter for the search. It is quite easy, and I have it practically done (for Lucene.NET), just not tested. The question is whether the Wikimedia servers are going to switch to another search engine, as has been discussed recently. (Lemmatization would be nice, too, but I know I do not have the required knowledge.)
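As a rough illustration of what such a filter does, here is a Python sketch. It is not the Lucene.NET code mentioned above, and the function names are hypothetical. It assumes that folding means Unicode decomposition followed by dropping the combining marks, applied to both the indexed text and the query.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed and case folded."""
    # NFD splits "á" into "a" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped).casefold()

def matches(query: str, title: str) -> bool:
    """Hypothetical helper: compare query and title after folding both."""
    return fold_diacritics(query) == fold_diacritics(title)

if __name__ == "__main__":
    for title in ["págo", "pàgo", "pago", "pagó", "paga"]:
        print(title, matches("pago", title))
```

With this, a search for "pago" would match "págo" and "pàgo" without any redirects. Letters that have no canonical decomposition (for example "ø" or "ł") pass through unchanged, so a real filter would need an extra mapping table for them.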
--[[cs:User:Mormegil | Petr Kadlec]]