What techniques would you recommend?

Hello Researchers,

I've been playing with the Recent Changes Stream Interface recently, and have started trying to use the API's "action=compare" to look at every diff of every wiki in real time. The goal is to produce real-time analytics on the content that's being added or deleted. The only problem is that it will really hammer the API with lots of reads, since "action=compare" doesn't have a batch interface. Can I spawn multiple network threads and do 10+ reads per second forever without the API complaining? Can I warn someone about this and get a special exemption for research purposes?

The other option would be to use "action=query" to get the revisions in batches and do the diffing myself, but then I'm not guaranteed to be diffing in the same way that the site is.
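
To make the second option concrete, here is a rough Python sketch of what I have in mind: batch revision IDs into "action=query" requests and count changed lines locally. Several things here are my assumptions rather than anything confirmed: the endpoint URL (a single wiki), the batch size of 50 revision IDs per request, the User-Agent string, and the helper names (chunked, build_query_params, fetch_revisions, count_changes). The diff uses Python's difflib on whole lines, so the result will not necessarily match what "action=compare" reports.

```python
import difflib
import json
import urllib.parse
import urllib.request

# Assumptions for illustration: one wiki's endpoint and a per-request
# batch size that an ordinary (non-bot) client is allowed to use.
API_URL = "https://en.wikipedia.org/w/api.php"
BATCH_SIZE = 50


def chunked(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_query_params(rev_ids):
    """Build action=query parameters that request wikitext for many revisions."""
    return {
        "action": "query",
        "prop": "revisions",
        "revids": "|".join(str(r) for r in rev_ids),
        "rvprop": "ids|content",
        "rvslots": "main",
        "format": "json",
        "formatversion": "2",
    }


def fetch_revisions(rev_ids):
    """Return {revid: wikitext} for one batch. Performs a network request."""
    url = API_URL + "?" + urllib.parse.urlencode(build_query_params(rev_ids))
    # Hypothetical User-Agent; replace with something that identifies you.
    headers = {"User-Agent": "diff-research-sketch/0.1 (contact: you@example.org)"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    texts = {}
    for page in data.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            texts[rev["revid"]] = rev["slots"]["main"]["content"]
    return texts


def count_changes(old_text, new_text):
    """Count (added, removed) lines between two texts using difflib."""
    old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, removed
```

In use, I would collect revision IDs (and their parent IDs) from the stream, call fetch_revisions once per piece returned by chunked(ids, BATCH_SIZE), and then run count_changes on each parent/child pair. That trades many small "action=compare" reads for far fewer, larger "action=query" reads, at the cost of a diff that is only an approximation of the site's own.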
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l