On Thu, Apr 2, 2009 at 3:49 AM, Ray Saintonge saintonge@telus.net wrote:
When you declare one version canonical the risk is that you will have supporters of the losing version(s) becoming irrationally angry.
Which version is canonical would be an implementation detail that isn't even visible to contributors, so this isn't a big deal. Wikis have to pick a canonical display type right now anyway for anonymous users who haven't specified a preference, right?
On Thu, Apr 2, 2009 at 5:38 AM, Milos Rancic millosh@gmail.com wrote:
Even in the simplest cases, such as Serbian script conversion, conversion is not transitive (however, the intransitivity is small and an approximation works well enough).
*That's* what would pose difficulties, yes.
Of course, it is possible to solve this by testing whether the surrounding letters are capital or not (and it is not a big deal in Serbian anyway). However, this is a very simple case for conversion rules. Usually, it is much cheaper to do the conversion when text is added or changed and to keep both versions in the database, because there are two different sets of conversion rules. The other option is to keep a single meta text in the database, with internal markup. So, the previous example might look like "{Latin: {DŽ}AK}".
I suspect this would be feasible to get working to an acceptable level, but only with a lot of effort. Natural languages are really messy. :(
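To make the "test the surrounding letters" idea concrete, here is a rough Python sketch of Cyrillic-to-Latin conversion for the digraph case. It is only an illustration under my own assumptions: the function name `cyr_to_lat`, the tiny partial letter tables, and the exact neighbour rule are all hypothetical, and this is not how any existing converter is known to work.

```python
# Hypothetical, deliberately partial tables: just enough letters for the example.
SINGLE = {"а": "a", "к": "k", "е": "e", "п": "p"}
DIGRAPH = {"џ": "dž", "љ": "lj", "њ": "nj"}


def cyr_to_lat(text):
    """Convert Serbian Cyrillic to Latin, choosing 'DŽ' vs 'Dž' from context."""
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        if low in DIGRAPH:
            lat = DIGRAPH[low]
            if ch.isupper():
                prev_ch = text[i - 1] if i > 0 else ""
                next_ch = text[i + 1] if i + 1 < len(text) else ""
                # Assumed rule: use the all-caps digraph when the next letter
                # is uppercase, or when there is no next letter and the
                # previous one is uppercase. Otherwise capitalise only the
                # first Latin letter.
                if next_ch.isupper() or (not next_ch.isalpha() and prev_ch.isupper()):
                    lat = lat.upper()
                else:
                    lat = lat.capitalize()
        elif low in SINGLE:
            lat = SINGLE[low].upper() if ch.isupper() else SINGLE[low]
        else:
            # Anything outside the tables is passed through unchanged.
            lat = ch
        out.append(lat)
    return "".join(out)


print(cyr_to_lat("ЏАК"))  # all-caps word
print(cyr_to_lat("Џак"))  # capitalised word
```

The same Cyrillic letter comes out as "DŽ" in one word and "Dž" in the other, so the capitalisation has to be recovered from context. A single isolated capital stays ambiguous, which is presumably where stored markup would have to take over.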