Hi,
If you don't still have this thread, the background is that the
Malayalam projects want to use, and are using, Unicode 5.1 for five
characters that have composed code points in 5.1 but are decomposed
sequences in 5.0. The equivalences are:
Name        5.0 sequence        5.1 code point
CHILLU NN   0D23, 0D4D, 200D    0D7A
CHILLU N    0D28, 0D4D, 200D    0D7B
CHILLU RR   0D30, 0D4D, 200D    0D7C
CHILLU L    0D32, 0D4D, 200D    0D7D
CHILLU LL   0D33, 0D4D, 200D    0D7E
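For reference, the table above could be written as a lookup on the bot side roughly like this. It is only a sketch: the names (CHILLU_50_TO_51, to_51, to_50, same_title) are made up for illustration and are not taken from Interwicket or from the server code.

```python
# Hypothetical helper names; only the code points come from the table above.
# 5.0 form: consonant + VIRAMA (U+0D4D) + ZWJ (U+200D); 5.1 form: one code point.
CHILLU_50_TO_51 = {
    "\u0D23\u0D4D\u200D": "\u0D7A",  # CHILLU NN
    "\u0D28\u0D4D\u200D": "\u0D7B",  # CHILLU N
    "\u0D30\u0D4D\u200D": "\u0D7C",  # CHILLU RR
    "\u0D32\u0D4D\u200D": "\u0D7D",  # CHILLU L
    "\u0D33\u0D4D\u200D": "\u0D7E",  # CHILLU LL
}
CHILLU_51_TO_50 = {atom: seq for seq, atom in CHILLU_50_TO_51.items()}


def to_51(text):
    """Replace each 5.0 three-code-point sequence with its 5.1 code point."""
    for seq, atom in CHILLU_50_TO_51.items():
        text = text.replace(seq, atom)
    return text


def to_50(text):
    """Replace each 5.1 chillu code point with its 5.0 sequence."""
    for atom, seq in CHILLU_51_TO_50.items():
        text = text.replace(atom, seq)
    return text


def same_title(a, b):
    """Compare two titles while ignoring which chillu form each one uses."""
    return to_51(a) == to_51(b)
```

With something like same_title, a 5.0 title on one wikt and a 5.1 title on ml would compare as equal, whichever form each side happens to store.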
Somewhere in the server code, these are "normalized" to 5.1 for the ml
projects. Problem:
http://ml.wiktionary.org/w/index.php?title=%E0%B4%95%E0%B5%81%E0%B4%B1%E0%B…
What you see happening there is Interwicket trying to create the
language links. It adds the correct link(s), pointing to the 5.0 forms
on the other wikts; then, on the next scan of the language links
tables, it removes the links as invalid, because the 5.1 titles don't
exist on the other wikts. This then repeats. (;-)
The problem is that it can't write the correct link, as the text
normalization "fixes" it.
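One thing that can be checked from the bot side is that this is probably not plain Unicode normalization (NFC/NFD), since as far as I know the 5.1 chillu code points have no canonical decomposition. A quick sketch of that check, assuming Python's unicodedata tables are at Unicode 5.1 or later; the helper name nfc_changes is made up:

```python
import unicodedata

SEQ_50 = "\u0D28\u0D4D\u200D"  # CHILLU N, 5.0 form: NA + VIRAMA + ZWJ
ATOM_51 = "\u0D7B"             # CHILLU N, 5.1 form


def nfc_changes(text):
    """Return True if NFC normalization would alter the text."""
    return unicodedata.normalize("NFC", text) != text


# NFC leaves the 5.0 sequence alone, and the 5.1 code point does not
# decompose, so the server must be doing a separate, explicit replacement.
print(nfc_changes(SEQ_50))
print(unicodedata.decomposition(ATOM_51) == "")
```

If that holds, the place to look on the server would be a Malayalam-specific replacement step rather than the generic normalization code, but that is a guess on my part.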
The other direction isn't a problem: the links are to the 5.0 forms,
and when followed they are normalized to 5.1 in the title lookup, so
the page is found.
I'm not (yet) suggesting a particular solution; there are several
possibilities (from fairly decent to grotesque hackery ...). But would
someone tell me where in the server code this is done? I have not been
able to find it. Then I can understand it a bit better, and possibly
just fix it in the bot code somehow, or suggest a fix server-side.
Best Regards,
Robert