See also:
http://dammit.lt/wikistats/
I've parsed every one of these files (at hourly granularity; grok.se
aggregates at the day level, I believe) since Jan. 2010 into a DB structure
indexed by page title. It currently takes up about 400GB of space.
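
A minimal sketch of what such a per-title index could look like; this is
not the actual pipeline described above. It assumes each line of an hourly
file has four space-separated fields (project code, page title, request
count, bytes transferred), and it uses SQLite purely for illustration,
since the message does not say which database is used. The table name
`hits` and the helpers `make_db`, `parse_line`, `load_hour`, and
`daily_views` are hypothetical.

```python
import sqlite3


def make_db(path=":memory:"):
    """Create the (hypothetical) hits table with an index on page title."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (title TEXT, hour TEXT, views INTEGER)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS hits_by_title ON hits (title, hour)")
    return conn


def parse_line(line):
    """Split one 'project title count bytes' record; None if malformed."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None
    project, title, count, size = parts
    try:
        return project, title, int(count), int(size)
    except ValueError:
        return None


def load_hour(conn, hour, lines, project="en"):
    """Store one hour of counts for a single project; return rows inserted."""
    rows = []
    for line in lines:
        rec = parse_line(line)
        if rec is not None and rec[0] == project:
            rows.append((rec[1], hour, rec[2]))
    conn.executemany(
        "INSERT INTO hits (title, hour, views) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def daily_views(conn, title, day):
    """Sum hourly views for a title over one day (hour keys start with YYYYMMDD)."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(views), 0) FROM hits WHERE title = ? AND hour LIKE ?",
        (title, day + "%"),
    )
    return cur.fetchone()[0]
```

Keeping the hour as a `YYYYMMDDHH` string means a day-level view like the
one grok.se presents is just a prefix query over the hourly rows, so the
finer granularity is not lost at load time.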
A comprehensive measurement study over this data would be
interesting (long-term trends, traffic spikes during cultural events,
etc.), but the technical infrastructure is already in place, and I doubt
a measurement study meets GSoC requirements.
Thanks, -AW
On 04/06/2011 11:05 AM, David Gerard wrote:
On 6 April 2011 16:02, Peng Wan <buaajackson(a)gmail.com> wrote:
This is Peng Wan. I have submitted my GSoC application
to Wikimedia.
My Project title is "Figuring out the most popular pages".
Does it do anything http://stats.grok.se/ doesn't?
- d.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Website: http://www.cis.upenn.edu/~westand