2008/1/28, Anders Wegge Jakobsen wegge@wegge.dk:
Consider a random article about the 1190s, like [[da:1190'erne]]. Follow the interwiki links from that one. Follow the interwiki links from the newly found articles. Wonder how you ended up back at dawiki, but this time at the year 1190.
There is something rotten, not just in the state of Denmark, but in a lot of the wikipedia articles about years and decades. I've made a simplified interwiki graph: http://wegge.dk/interwiki-1190.png. Be warned that it's a 13522 x 309 png image. The graph clearly shows that a lot of wikipedias have a path from their respective decade article, via [[he:1190]], back to the same wikipedia's year article. This is not the only decade showing the problem, so it's not easy to fix. Given the large number of wikipedias involved, it's too large a task to perform by hand, and I'm also afraid that before one such loop has been fixed, one or more iw bots will start spreading the problem again.
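For illustration only, here is a rough sketch of the kind of walk that exposes such a loop. It is not the script used to build the graph above; it just follows langlinks breadth-first through the standard action=query&prop=langlinks API and reports any link that comes back to the starting wiki under a different title. The helper name, the page cap and the endpoint pattern are assumptions made for the example.

import requests
from collections import deque

def langlinks(lang, title):
    """Return the [(lang, title), ...] interwiki links of one article."""
    r = requests.get(
        "https://%s.wikipedia.org/w/api.php" % lang,
        params={"action": "query", "prop": "langlinks", "titles": title,
                "lllimit": "max", "format": "json"},
    )
    links = []
    for page in r.json()["query"]["pages"].values():
        for ll in page.get("langlinks", []):
            links.append((ll["lang"], ll["*"]))
    return links

def find_loops(start_lang, start_title, max_pages=200):
    """Breadth-first walk over interwiki links; report any link that leads
    back to the starting wiki but at a different article (e.g. da:1190
    instead of da:1190'erne)."""
    seen = {(start_lang, start_title)}
    queue = deque([(start_lang, start_title)])
    while queue and len(seen) < max_pages:
        lang, title = queue.popleft()
        for tlang, ttitle in langlinks(lang, title):
            if tlang == start_lang and ttitle != start_title:
                print("loop: %s:%s -> %s:%s" % (lang, title, tlang, ttitle))
            if (tlang, ttitle) not in seen:
                seen.add((tlang, ttitle))
                queue.append((tlang, ttitle))

find_loops("da", "1190'erne")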
So does anyone have an idea about how to solve this mess by bot?
Just do
python interwiki.py 1190'erne -ignore:he:1190 -force
with a bot that is registered at all languages that have a page on the decade (or, if it is not registered everywhere, do the remaining languages by hand). It will remove the incorrect link and get the thing working correctly again.
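If you want to know which languages are still left to do by hand, something along these lines would list them. Again this is just a sketch, not part of the framework: it takes the language versions of the decade article as linked from dawiki and flags the ones that still carry the bad [[he:1190]] link. The endpoint pattern and helper name are assumptions.

import requests

API = "https://%s.wikipedia.org/w/api.php"

def langlinks(lang, title):
    r = requests.get(API % lang,
                     params={"action": "query", "prop": "langlinks",
                             "titles": title, "lllimit": "max",
                             "format": "json"})
    for page in r.json()["query"]["pages"].values():
        return [(ll["lang"], ll["*"]) for ll in page.get("langlinks", [])]
    return []

# All language versions of the decade article, as linked from dawiki ...
versions = [("da", "1190'erne")] + langlinks("da", "1190'erne")

# ... and the ones among them that still point at he:1190.
for lang, title in versions:
    if ("he", "1190") in langlinks(lang, title):
        print("still to fix: %s:%s" % (lang, title))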
As for your fear that "before one such loop has been fixed, one or more iw bots will start spreading the problem again" - this will not happen unless their operators are doing a really bad job. A bot working on the 1190s will find that there are languages for which it gets two links - one for the 1190s and one for 1190. In such a case, it will not make a decision on its own or in fact make any changes at all. Instead:
* if the bot is running autonomously, it will skip the page
* if the bot is running interactively, it will ask the operator which pages to include and which ones not
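To make the safeguard concrete, here is a toy version of that check - not pywikipedia's actual code, just the idea: group the links collected for a page per language, and treat any language with two different targets as a conflict that an autonomous bot refuses to touch.

from collections import defaultdict

def conflicting_languages(found_links):
    """found_links: iterable of (lang, title) pairs collected while working."""
    per_lang = defaultdict(set)
    for lang, title in found_links:
        per_lang[lang].add(title)
    return {lang: titles for lang, titles in per_lang.items() if len(titles) > 1}

# Example: while working on the 1190s, both da:1190'erne and da:1190 turn up.
links = [("da", "1190'erne"), ("he", "1190"), ("da", "1190"), ("en", "1190s")]
conflicts = conflicting_languages(links)
if conflicts:
    print("skipping page, conflicting interwiki links:", conflicts)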
The only bot that will re-create the mess is an interactive bot whose operator makes the wrong choice about what to include. Bots do have a risk of copying mistakes, but once a loop to a different page in the same language has been found, the bots will stop and just ignore the pages involved.