Nicolas Dumazet wrote:
Honestly, the pywikipedia team has changed a bit these last months, and API editing will soon be available: I've been telling myself for days that interwiki.py will need a rewrite sooner or later. But it is not that easy.
I understand your concept of an "interwiki class", but finding such a class is not that obvious.
If you have a general pseudo-algorithm that can outline a specific class of articles on the same subject, please share it. But I think that the current behavior -- starting from a specific page, building the interwiki link graph, and indexing the cycles -- while not optimal, cannot be avoided that easily.
No, it can't be avoided, but Purodha is right that using the toolserver DBs would be faster. Now, I don't know how interwiki.py is structured, but I think it calls for different pluggable modules for whatever is doing get_interwikis_from_page(). So you could have one acting as it does now, another obtaining the data via the API, and yet another one directly querying the langlinks table at the toolserver.
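
To make the pluggable idea concrete, here is a rough sketch. It is not taken from interwiki.py: the names PageRef, make_static_source and collect_linked_pages are made up for illustration, and the only backend shown is an in-memory stand-in. The point is just that the graph walk only needs a callable with the shape of get_interwikis_from_page(), so a wikitext, API or toolserver backend could be swapped in without touching the traversal.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

# A page is identified by (language code, title). Hypothetical type, for illustration.
PageRef = Tuple[str, str]
# Any backend is a callable mapping a page to the pages it links to via interwikis.
InterwikiSource = Callable[[PageRef], List[PageRef]]


def make_static_source(links: Dict[PageRef, List[PageRef]]) -> InterwikiSource:
    """Hypothetical in-memory backend standing in for a wikitext, API or DB backend."""
    def get_interwikis_from_page(page: PageRef) -> List[PageRef]:
        return list(links.get(page, []))
    return get_interwikis_from_page


def collect_linked_pages(start: PageRef, source: InterwikiSource) -> Set[PageRef]:
    """Walk the interwiki graph breadth-first from start, using whatever backend is given."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in source(page):
            if target not in seen:  # cycles are expected, so never revisit a page
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    demo_links = {
        ("en", "Cat"): [("fr", "Chat")],
        ("fr", "Chat"): [("de", "Katze"), ("en", "Cat")],
    }
    print(sorted(collect_linked_pages(("en", "Cat"), make_static_source(demo_links))))
```

Each real backend would only have to return the same list of (language, title) pairs; everything else in the bot could stay as it is.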
Directly querying the langlinks table not only saves the time spent querying the wiki, but also allows querying interwikis for only those wikis you're writing to. This also opens up the possibility of completely changing the source wiki concept, and instead querying each wiki DB for links to a target wiki.
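
As a rough illustration of the filtered query, something like the following would do. The langlinks columns (ll_from, ll_lang, ll_title) are the ones MediaWiki uses; everything else is an assumption: the helper names are made up, and an in-memory SQLite table stands in for the toolserver database so the sketch can run anywhere. A real toolserver connection would be MySQL, where the placeholder syntax depends on the driver.

```python
import sqlite3
from typing import List, Sequence, Tuple


def langlinks_to(conn: sqlite3.Connection,
                 page_ids: Sequence[int],
                 target_langs: Sequence[str]) -> List[Tuple[int, str, str]]:
    """Return (ll_from, ll_lang, ll_title) rows, restricted to the wikis we write to."""
    if not page_ids or not target_langs:
        return []
    id_marks = ",".join("?" * len(page_ids))
    lang_marks = ",".join("?" * len(target_langs))
    sql = (
        "SELECT ll_from, ll_lang, ll_title FROM langlinks "
        f"WHERE ll_from IN ({id_marks}) AND ll_lang IN ({lang_marks}) "
        "ORDER BY ll_from, ll_lang"
    )
    return conn.execute(sql, [*page_ids, *target_langs]).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny stand-in langlinks table (made-up rows, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE langlinks (ll_from INTEGER, ll_lang TEXT, ll_title TEXT)")
    conn.executemany(
        "INSERT INTO langlinks VALUES (?, ?, ?)",
        [(1, "fr", "Chat"), (1, "de", "Katze"), (1, "es", "Gato"), (2, "fr", "Chien")],
    )
    return conn


if __name__ == "__main__":
    # Only fetch links pointing at the wikis the bot is allowed to edit.
    for row in langlinks_to(demo_connection(), [1, 2], ["fr", "de"]):
        print(row)
```

Passing many page ids in one call is where the time saving would come from: one query replaces one page fetch per article.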