Hi,
In case you no longer have this thread: the background is that the Malayalam projects want to use, and are already using, Unicode 5.1 for five characters that have single composed code points in 5.1 but are encoded as decomposed sequences in 5.0. The equivalences are:
Character    5.0 sequence        5.1 code point
CHILLU NN    0D23, 0D4D, 200D    0D7A
CHILLU N     0D28, 0D4D, 200D    0D7B
CHILLU RR    0D30, 0D4D, 200D    0D7C
CHILLU L     0D32, 0D4D, 200D    0D7D
CHILLU LL    0D33, 0D4D, 200D    0D7E
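
For reference, here is a minimal Python sketch of the equivalences above. It only illustrates the mapping; it is not the server's code (which I have not found). The names CHILLU_50_TO_51, to_51 and to_50 are made up for this example.

```python
# Illustrative only: these names are hypothetical, not taken from the
# server code or from any existing bot framework.

# Unicode 5.0 sequence (consonant + VIRAMA U+0D4D + ZWJ U+200D)
# mapped to the Unicode 5.1 atomic chillu code point.
CHILLU_50_TO_51 = {
    "\u0d23\u0d4d\u200d": "\u0d7a",  # CHILLU NN
    "\u0d28\u0d4d\u200d": "\u0d7b",  # CHILLU N
    "\u0d30\u0d4d\u200d": "\u0d7c",  # CHILLU RR
    "\u0d32\u0d4d\u200d": "\u0d7d",  # CHILLU L
    "\u0d33\u0d4d\u200d": "\u0d7e",  # CHILLU LL
}

# Reverse mapping: 5.1 code point back to the 5.0 sequence.
CHILLU_51_TO_50 = {atom: seq for seq, atom in CHILLU_50_TO_51.items()}


def to_51(text):
    """Replace each 5.0 chillu sequence with its 5.1 code point."""
    for seq, atom in CHILLU_50_TO_51.items():
        text = text.replace(seq, atom)
    return text


def to_50(text):
    """Replace each 5.1 chillu code point with its 5.0 sequence."""
    for atom, seq in CHILLU_51_TO_50.items():
        text = text.replace(atom, seq)
    return text
```

With this, to_51(title) gives the form the ml projects end up storing, and to_50(title) gives the form the other wikts use in their titles.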
Somewhere in the server code, these are "normalized" to 5.1 for the ml projects. Problem:
http://ml.wiktionary.org/w/index.php?title=%E0%B4%95%E0%B5%81%E0%B4%B1%E0%B5...
What you see happening is Interwicket trying to create the language links. It adds the correct link(s) to the 5.0 forms on the other wikts; then, on the next scan of the language links tables, it removes the links as invalid, because the 5.1 titles don't exist on the other wikts. This then repeats. (;-)
The problem is that the bot can't write the correct link, because the text normalization "fixes" it on save.
The other direction isn't a problem: the links on the other wikts point to the 5.0 forms, and when followed they are normalized to 5.1 in the title lookup, so the page is found.
I'm not (yet) suggesting a particular solution; there are several possibilities (from fairly decent to grotesque hackery ...). But would someone tell me where in the server code this normalization is done? I have not been able to find it. Then I can understand it a bit better, and possibly just fix it in the bot code somehow, or suggest a fix server-side.
Best Regards, Robert