Purodha Blissenbach wrote:
While we're at it - in the future, we shall have interwiki bots reading
the replicated databases to a great extent while gathering information
about existing and presumably missing interwiki links. This will spare
lots of requests to the WMF servers, which will then be bothered only
when wiki pages are actually altered.
Using the replicated data instead of making HTTP (API) requests should
speed up the data collection phase for large interwiki groups from
several minutes to a second or so.
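
As an illustration (my own sketch, not code from this thread), reading
existing links straight from a replica's langlinks table might look like
the following. The host and database names (sql-s1, enwiki_p) are
assumptions about the Toolserver naming scheme; langlinks and page are
standard MediaWiki schema.

import os
import MySQLdb

def existing_langlinks(page_title, db="enwiki_p", host="sql-s1"):
    """Return {language: title} for all interwiki links on one page."""
    conn = MySQLdb.connect(host=host, db=db,
                           read_default_file=os.path.expanduser("~/.my.cnf"))
    try:
        cur = conn.cursor()
        # ll_from holds the page id; join against page to match the title.
        cur.execute(
            """SELECT ll_lang, ll_title
                 FROM langlinks
                 JOIN page ON ll_from = page_id
                WHERE page_namespace = 0 AND page_title = %s""",
            (page_title.replace(" ", "_"),))
        return dict((lang, title) for lang, title in cur.fetchall())
    finally:
        conn.close()
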
Another approach to making interwiki bots use the replicated data would
be to pre-process their interwiki data into a list or table of versioned
change requests, published on the Toolserver. Interwiki worker bots
running elsewhere would pick requests from the list and process them;
see the sketch below. Picked requests are postponed for a while, until
the replicated data shows them as done, or until a timeout
(> replication lag) expires.
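
A rough sketch of such a worker loop, as I read the proposal - the
helpers fetch_requests(), apply_edit() and is_done_in_replica() are
hypothetical placeholders for the published request list and the
replica check:

import time

TIMEOUT = 600  # seconds; must exceed the current replication lag

def work_loop():
    pending = {}  # request id -> time when we picked it
    while True:
        for req in fetch_requests():      # pull from the published list
            if req.id not in pending:
                apply_edit(req)           # perform the wiki edit
                pending[req.id] = time.time()
        for rid, picked in list(pending.items()):
            if is_done_in_replica(rid):   # replica confirms the change
                del pending[rid]
            elif time.time() - picked > TIMEOUT:
                del pending[rid]          # timeout exhausted; give up or retry
        time.sleep(30)
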
Greetings - Purodha
There's a strong argument for rewriting interwiki.py. Does anyone know
what an interwiki bot should actually do? It seems worth determining
the Right Algorithm to use when automatically resolving interwikis.
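
For what it's worth, one candidate reading of that algorithm (my own
sketch, not anything agreed on in this thread): treat interwiki links as
a graph, compute the connected component of a page, and propose the
links it is missing, skipping languages where the component contains
more than one page - a conflict a human should resolve. fetch_links()
is a hypothetical helper returning the (lang, title) pairs a page
currently links to.

from collections import deque

def interwiki_component(start):
    """start and all results are (lang, title) pairs."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for neighbour in fetch_links(page):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

def missing_links(page, component):
    """Links page should gain: one target per other language.
    Languages with several pages in the component are conflicts
    and are skipped."""
    counts = {}
    for lang, _ in component:
        counts[lang] = counts.get(lang, 0) + 1
    have = set(lang for lang, _ in fetch_links(page))
    return set((lang, title) for lang, title in component
               if lang != page[0] and lang not in have
               and counts[lang] == 1)
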
MZMcBride wrote:
Do you know the status of getting a solution built into MediaWiki
(either in core or in an extension) that could make interwiki.py
completely obsolete? It's my _strong_ recommendation that development
effort be put into a real solution rather than focusing on ways to make
interwiki.py suck less.
MZMcBride
Last discussion:
http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/50203
Any work there is likely to wait for merging the interwiki transclusion
branch.