* Sumana Harihareswara wrote:
Vinay from the Internet Archive asked me, with reference to http://meta.wikimedia.org/wiki/Help:Recent_changes :
Is there someone I can contact regarding parsing out the URLs from the stream of recent changes? The idea being to grab the text of the recent change and extract out anything that looks like a URL and feed it into a queue at IA's end for archiving.
Note that there is an API to get all the external links on a given page (http://www.mediawiki.org/wiki/API:Properties#extlinks_.2F_el), and there may well be bots that already monitor new external links as an anti-spam measure. Also, the IRC feeds are probably a better way to keep track of recent changes.
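
For illustration, here is a rough Python sketch of how the extlinks API could be queried for a single page. It is untested against the live site and rests on a few assumptions of mine: the endpoint URL (meta.wikimedia.org/w/api.php), the User-Agent string, and the function names are all made up for this example. It also does not follow continuation, so pages with more links than the limit would need extra requests.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint; substitute the api.php of whichever wiki you are watching.
API_URL = "https://meta.wikimedia.org/w/api.php"


def build_extlinks_url(title, api_url=API_URL, limit=500):
    """Build a prop=extlinks query URL for one page title."""
    params = {
        "action": "query",
        "prop": "extlinks",
        "titles": title,
        "ellimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def parse_extlinks(response):
    """Pull the external URLs out of a decoded API response."""
    urls = []
    pages = response.get("query", {}).get("pages", {})
    # "pages" is a dict keyed by page ID in the default JSON format,
    # and a list when formatversion=2 is requested; accept either.
    page_iter = pages.values() if isinstance(pages, dict) else pages
    for page in page_iter:
        for link in page.get("extlinks", []):
            # The URL sits under "*" in the default format, "url" in formatversion=2.
            url = link.get("*") or link.get("url")
            if url:
                urls.append(url)
    return urls


def fetch_extlinks(title):
    """Fetch and parse the external links of one page (network access required)."""
    request = Request(
        build_extlinks_url(title),
        headers={"User-Agent": "extlinks-sketch/0.1 (hypothetical example)"},
    )
    with urlopen(request) as resp:
        return parse_extlinks(json.load(resp))
```

Each title seen in the recent changes feed could be passed to fetch_extlinks() and the resulting URLs pushed onto the archiving queue, which would avoid scraping URLs out of the raw change text.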