I'm not sure I want to throw the API open to the public (the grok.se folks, among others, already run a fine service for casual experimentation).
However, I am willing to share the data with interested researchers who need to do some serious crunching (I have a Java API and could distribute database credentials on a per-case basis).
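To give a flavor of what a query looks like, here is a rough sketch of the sort of computation the API wraps. Everything below (the ViewEstimator class, the hourly_views table and its columns, the connection URL) is an illustrative placeholder, not my actual schema or interface:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/*
 * Sketch: estimate the views a page received in a time window by
 * summing its hourly buckets. Table/column names are hypothetical.
 */
public class ViewEstimator {

    public static long viewsBetween(Connection db, String pageTitle,
                                    long fromTime, long toTime) throws Exception {
        String sql = "SELECT COALESCE(SUM(views), 0) FROM hourly_views "
                   + "WHERE page_title = ? AND hour_utc >= ? AND hour_utc < ?";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, pageTitle);
            ps.setLong(2, fromTime);
            ps.setLong(3, toTime);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Credentials would be issued per-case, as described above.
        try (Connection db = DriverManager.getConnection(
                "jdbc:mysql://example.org/pageviews", "user", "pass")) {
            // e.g., views on [[Main Page]] across a 48-hour window
            long views = viewsBetween(db, "Main_Page", 1262304000L, 1262476800L);
            System.out.println("Estimated views in window: " + views);
        }
    }
}

Summing the hourly buckets over an edit's lifespan is exactly the kind of computation described below.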
I'll note that I only parse the English Wikipedia at this time. I've found the data useful in my anti-vandalism research (e.g., "given that an edit survived between time [w] and [x] on article [y], we estimate it received [z] views").

Thanks, -AW
On 04/06/2011 08:44 PM, MZMcBride wrote:
Andrew G. West wrote:
I've parsed every one of these files since Jan. 2010 (at hour granularity; grok.se aggregates at the day level, I believe) into a DB structure indexed by page title. It takes up about 400 GB of space at the moment.
Is your database available to the public? The Toolserver folks have been talking about getting the page view stats into usable form for quite some time, but nothing's happened yet. If you have an API or something similar, that would be fantastic. (stats.grok.se has a rudimentary API that I don't imagine many people are aware of.)
MZMcBride