The problem (as always) is that there is a difference between pages served (by the web server) and pages actually wanted and read by the user.
It would be interesting to have referrer statistics. I'm guessing that many Wikipedia pages are being referred by Google (and other general search engines). If so, people may just be clicking through a list of search results, which causes them to download a WP page but then immediately move on to the next search result because it isn't what they are looking for. I rather suspect the prominence of Facebook in the English Wikipedia results is due to this effect, as I often find myself on the Wikipedia page for Facebook instead of Facebook itself following a Google search. I think the use of mobile devices (with small screens) probably encourages this sort of behaviour.
Kerry
-----Original Message----- From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Andrew G. West Sent: Sunday, 30 December 2012 2:06 PM To: wiki-research-l@lists.wikimedia.org Subject: Re: [Wiki-research-l] 2012 top pageview list
The WMF aggregates them as (page,views) pairs on an hourly basis:
http://dumps.wikimedia.org/other/pagecounts-raw/
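For anyone unfamiliar with these dumps: each hourly file is (to my understanding) whitespace-delimited lines of project code, URL-encoded page title, view count, and bytes transferred. A minimal Python sketch for pulling the (page, views) pairs out of one hourly file (the file name in the example is hypothetical):

  import gzip

  # Parse one hourly pagecounts file into {page: views} for a given project.
  # Assumed line format: project, page title, views, bytes transferred.
  def parse_hourly(path, project="en"):
      counts = {}
      with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as f:
          for line in f:
              parts = line.split()
              if len(parts) != 4 or parts[0] != project:
                  continue
              title, views = parts[1], int(parts[2])
              counts[title] = counts.get(title, 0) + views
      return counts

  # Example (hypothetical file name):
  # hourly = parse_hourly("pagecounts-20121230-140000.gz")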
I've been parsing these and storing them in a queryable DB format (for en.wp exclusively, though I think the files are available for all projects) for about two years. If you want to maintain such a fine granularity, it can quickly become a terabyte-scale task that eats up a lot of processing time.
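To illustrate why the fine-grained approach balloons, here is a rough sketch of that kind of queryable store using SQLite; the schema and table name are my own assumptions, not Andrew's actual setup:

  import sqlite3

  # Load one hour of (page, views) pairs (e.g. from parse_hourly() above).
  def load_hour(db_path, hour_ts, counts):
      con = sqlite3.connect(db_path)
      con.execute("""CREATE TABLE IF NOT EXISTS pageviews (
                         hour TEXT, page TEXT, views INTEGER)""")
      con.executemany("INSERT INTO pageviews VALUES (?, ?, ?)",
                      ((hour_ts, page, v) for page, v in counts.items()))
      con.commit()
      con.close()

  # At one row per (page, hour), en.wp alone adds millions of rows per day,
  # which is where the terabyte-scale storage and processing cost comes from.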
If you're looking for coarser-granularity reports (like top views for a day, week, or month), a lot of efficient aggregation can be done.
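For instance, a top-N-per-day report can be produced by summing the hourly dictionaries in memory before anything touches a database (again assuming the parse_hourly() helper sketched above):

  from collections import Counter

  # Sum a day's worth of hourly files and return the n most-viewed pages.
  def top_n_for_day(hourly_paths, n=5000):
      totals = Counter()
      for path in hourly_paths:
          totals.update(parse_hourly(path))
      return totals.most_common(n)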
See also: http://en.wikipedia.org/wiki/Wikipedia:5000
Thanks, -AW
On 12/28/2012 07:28 PM, John Vandenberg wrote:
There is a steady stream of blogs and 'news' about these lists
https://encrypted.google.com/search?client=ubuntu&channel=fs&q=%22Se...nd%22&ie=utf-8&oe=utf-8#q=wikipedia+top+2012&hl=en&safe=off&client=ubuntu&tbo=d&channel=fs&tbm=nws&source=lnt&tbs=qdr:w&sa=X&psj=1&ei=GzjeUOPpAsfnrAeQk4DgCg&ved=0CB4QpwUoAw&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.&bvm=bv.1355534169,d.aWM&fp=4e60e761ee133369&bpcl=40096503&biw=1024&bih=539
How does a researcher go about obtaining access logs with user agents in order to answer some of these questions?