Hoi, While it is interesting to know which articles are popular, that serves no real purpose beyond curiosity. If the information instead shows which articles are missed most often, you provide functionality that we do not have and that has a really practical application.
When people in India, Pakistan and Sri Lanka all have a bout of cricket fever at the same time, it will show in the traffic numbers from these areas. When a Pakistani cricketer is really popular on Wikipedia and he is missing from the Tamil or Sinhala Wikipedia, that is the type of intelligence that points to an article that is likely to be really popular once written.
There is a big need for statistics that point to the articles that are missing, and there are several approaches to such data. In my opinion, given the right take on this issue, it is definitely a GSoC-worthy project. Thanks, GerardM
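To make the "missing article" idea concrete, here is a minimal sketch in Python. It assumes that per-language view counts are already keyed by a language-independent topic identifier, which is a hypothetical simplification; in practice that mapping would have to come from something like interlanguage links. The function name, the data layout and the sample values are all illustrative and not part of any existing tool.

```python
def rank_missing(views_by_lang, existing_topics, target_lang):
    """Rank topics that are viewed elsewhere but absent from target_lang.

    views_by_lang:   {lang: {topic: hits}}; `topic` is a hypothetical
                     language-independent key for an article.
    existing_topics: {lang: set of topics that already have an article}.
    Returns a list of (topic, total_hits) sorted by descending hits.
    """
    have = existing_topics.get(target_lang, set())
    demand = {}
    for lang, counts in views_by_lang.items():
        if lang == target_lang:
            continue  # only traffic from other wikis signals a gap
        for topic, hits in counts.items():
            if topic not in have:
                demand[topic] = demand.get(topic, 0) + hits
    # Highest demand first; ties broken by topic name for stable output.
    return sorted(demand.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up numbers, purely for illustration.
views = {
    "en": {"cricketer_x": 900, "cricket": 500},
    "ur": {"cricketer_x": 400},
    "ta": {"cricket": 50},
}
existing = {"ta": {"cricket"}}
print(rank_missing(views, existing, "ta"))
```

A real version would probably also weight traffic by where the readers are, since the point above is that regional interest predicts demand in the regional languages.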
On 6 April 2011 17:46, Andrew G. West westand@cis.upenn.edu wrote:
See also: http://dammit.lt/wikistats/
I've parsed every one of these files (at hour granularity; grok.se aggregates at day-level, I believe) since Jan. 2010 into a DB structure indexed by page title. It takes up about 400GB of space, at the moment.
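For readers who have not worked with these dumps, a rough sketch of this kind of per-title index might look like the following. It is not the actual schema or code used here. It assumes each line of an hourly file has four space-separated fields (project code, page title, request count, bytes transferred), and it uses an in-memory SQLite table with hypothetical table, column and function names.

```python
import sqlite3


def load_hourly_counts(conn, hour, lines):
    """Insert one hourly file's lines into a table indexed by title.

    `hour` is a label such as "20110406-15" (an assumed naming scheme).
    Lines that do not match the assumed 4-field layout are skipped.
    Returns the number of rows stored.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits "
        "(project TEXT, title TEXT, hour TEXT, views INTEGER)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS hits_by_title ON hits (title)")
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or not parts[2].isdigit():
            continue  # malformed line
        project, title, views, _size = parts
        rows.append((project, title, hour, int(views)))
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def daily_total(conn, project, title, day):
    """Sum the hourly rows for one title over a day label like "20110406"."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(views), 0) FROM hits "
        "WHERE project = ? AND title = ? AND hour LIKE ?",
        (project, title, day + "%"),
    )
    return cur.fetchone()[0]
```

Keeping one row per hour is what allows day-level totals to be derived afterwards, whereas the reverse is not possible from day-level aggregates.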
While a comprehensive measurement study over this data would be interesting (long term trends, traffic spikes during cultural events, etc.) -- the technical infrastructure is already in place. I doubt a measurement study meets GSoC requirements.
Thanks, -AW
On 04/06/2011 11:05 AM, David Gerard wrote:
On 6 April 2011 16:02, Peng Wan buaajackson@gmail.com wrote:
This is Peng Wan. I have submitted my GSoC application to Wikimedia. My project title is "Figuring out the most popular pages".
Does it do anything http://stats.grok.se/ doesn't?
- d.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
-- Andrew G. West, Doctoral Student Dept. of Computer and Information Science University of Pennsylvania, Philadelphia PA Website: http://www.cis.upenn.edu/~westand