See also: http://dammit.lt/wikistats/
I've parsed every one of these files (at hourly granularity; grok.se aggregates at the day level, I believe) since Jan. 2010 into a DB structure indexed by page title. It takes up about 400 GB of space at the moment.
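For illustration only, here is a minimal sketch of loading hourly files into a store keyed by page title. It is not the pipeline described above. It assumes each line of an hourly file has four space-separated fields (project code, page title, view count, bytes transferred), and it uses SQLite as a stand-in database. The table name `hourly_views` and the functions `parse_lines`, `load_hour`, and `daily_total` are hypothetical.

```python
import sqlite3

def parse_lines(lines, project="en"):
    """Yield (title, views) for one project from an hourly file.

    Assumed line format: "<project> <title> <views> <bytes>".
    Malformed lines are skipped rather than raising.
    """
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) != 4 or parts[0] != project:
            continue
        try:
            views = int(parts[2])
        except ValueError:
            continue
        yield parts[1], views

def load_hour(conn, hour, lines, project="en"):
    """Store one hour of counts; `hour` is a label such as '2011-04-06T10'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hourly_views ("
        " title TEXT NOT NULL,"
        " hour  TEXT NOT NULL,"
        " views INTEGER NOT NULL,"
        " PRIMARY KEY (title, hour))"  # title first, so lookups by page are indexed
    )
    conn.executemany(
        "INSERT OR REPLACE INTO hourly_views VALUES (?, ?, ?)",
        ((title, hour, views) for title, views in parse_lines(lines, project)),
    )
    conn.commit()

def daily_total(conn, title, day):
    """Sum the hourly counts of one page for a day such as '2011-04-06'."""
    row = conn.execute(
        "SELECT COALESCE(SUM(views), 0) FROM hourly_views"
        " WHERE title = ? AND hour LIKE ?",
        (title, day + "%"),
    ).fetchone()
    return row[0]
```

Keeping the hourly rows and summing on demand preserves the hour-level detail, while a day-level view like the one grok.se offers is still a single query away.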
A comprehensive measurement study over this data would be interesting (long-term trends, traffic spikes during cultural events, etc.), but the technical infrastructure is already in place, and I doubt a measurement study meets GSoC requirements.
Thanks, -AW
On 04/06/2011 11:05 AM, David Gerard wrote:
On 6 April 2011 16:02, Peng Wan <buaajackson@gmail.com> wrote:
This is Peng Wan. I have submitted my application to Wikimedia for GSoC. My project title is "Figuring out the most popular pages".
Does it do anything http://stats.grok.se/ doesn't?
- d.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l