On 21/01/12 13:13, Merlijn van Deen wrote:
On the Toolserver side, I would appreciate any comments, work, or existing work on creating an interwiki graph from the database. There are already tools that suggest images based on interwiki links, so this code should be around and hopefully be adaptable. The only goal for this process would be to create a list of starting pages interwiki.py can use, i.e. graphs with one or more missing links, but without any double links.
http://toolserver.org/~platonides/InterwikiPool/InterwikiPool.php shows, for a given page, the other entries in that interwiki pool (as well as a little summary of the differences).
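The selection criterion quoted above (pools with at least one missing link but no double links) could be sketched roughly as follows. This is only an illustration, not how InterwikiPool.php works: it assumes the links are already available as pairs of (language, title) tuples, for example extracted from the langlinks table. It treats a pool as "complete" when every page links to every other page in it. The function name find_candidate_pools is made up for this example.

```python
from collections import defaultdict


def find_candidate_pools(links):
    """Return interwiki pools that have missing links but no double links.

    links: iterable of (source, target) pairs, where each page is a
    (language, title) tuple. This input format is an assumption.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    link_set = set()
    for src, dst in links:
        link_set.add((src, dst))
        parent[find(src)] = find(dst)  # merge the two pools

    # Group pages by the pool (connected component) they belong to.
    pools = defaultdict(set)
    for node in list(parent):
        pools[find(node)].add(node)

    candidates = []
    for pages in pools.values():
        langs = [lang for lang, _title in pages]
        # Double link: two different pages of the same wiki in one pool.
        has_double = len(langs) != len(set(langs))
        # Complete: every page links to every other page in the pool.
        is_complete = all(
            (a, b) in link_set for a in pages for b in pages if a != b
        )
        if not has_double and not is_complete:
            candidates.append(sorted(pages))
    return sorted(candidates)


if __name__ == "__main__":
    example = [
        (("en", "A"), ("de", "A")),
        (("de", "A"), ("en", "A")),
        (("en", "A"), ("fr", "A")),   # fr:A links to nobody: missing links
        (("en", "B"), ("de", "B")),
        (("de", "B"), ("en", "B")),   # complete pool, nothing to do
        (("en", "C"), ("de", "C")),
        (("en", "C"), ("de", "C2")),  # two de pages: double link
    ]
    for pool in find_candidate_pools(example):
        print(pool)
```

Any page from a returned pool could then serve as a starting page for interwiki.py. Pools with double links are left out because they need a human decision.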
On the Pywikipedia side, some thoughts on running interwiki.py in a new process would be welcome, e.g. how can we improve startup time ('kill all the regexps!') and effectively spawn multiple processes to run? What parameters (throttles?) should be tuned, et cetera?
It doesn't need to be one process per page. A single process could, e.g., handle 10 pages instead, so the startup cost is paid once per batch.
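A minimal sketch of that batching idea: split the list of starting pages into groups of 10 and build one command line per group. The helper names are invented for this example. Passing titles as repeated -page: arguments is an assumption about how interwiki.py takes its input, so adjust it to whatever your checkout actually accepts.

```python
import sys


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_commands(titles, batch_size=10, script="interwiki.py"):
    """Build one command line per batch of page titles.

    The '-page:Title' argument form is an assumption, not a confirmed
    interface of interwiki.py.
    """
    return [
        [sys.executable, script] + ["-page:" + title for title in batch]
        for batch in chunked(list(titles), batch_size)
    ]


if __name__ == "__main__":
    titles = ["Page%d" % i for i in range(25)]
    for command in build_commands(titles):
        # Each command could be handed to subprocess.Popen to run the
        # batches in parallel; here they are only printed.
        print(" ".join(command))
```

With 25 titles and a batch size of 10 this gives three commands, so the startup cost is paid three times rather than 25.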