I'm curious what these difficulties are that prevent the calculation of monthly top viewed pages. After a request from the WMF Communications team, I generated a list of the 200 most viewed pages from May 2015 to October 2015 as input for the #Edit2015 video (cf. https://phabricator.wikimedia.org/T117945 ). IIRC that query did not take terribly long to complete.
The current top endpoint has breakdowns by project (~800 wikis) and access method (desktop / mobile-web / mobile-app). I think this makes the computation harder as opposed to a global query.
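To make the difference concrete, here is a minimal Python sketch. It is not how the top endpoint is actually implemented; the rows, the function names, and the view counts are all made up for illustration. It only contrasts one ranking over everything with one ranking per (project, access method) pair.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (project, access, page, views). Not real data.
ROWS = [
    ("en.wikipedia", "desktop", "Main_Page", 900),
    ("en.wikipedia", "mobile-web", "Main_Page", 700),
    ("en.wikipedia", "desktop", "Star_Wars", 400),
    ("de.wikipedia", "desktop", "Hauptseite", 300),
    ("de.wikipedia", "mobile-app", "Star_Wars", 50),
]


def global_top(rows, n):
    """One ranking over all rows, summing views across access methods.

    Pages are keyed by (project, page) so that equal titles on
    different wikis are not merged.
    """
    totals = Counter()
    for project, _access, page, views in rows:
        totals[(project, page)] += views
    return totals.most_common(n)


def top_by_breakdown(rows, n):
    """One separate ranking per (project, access) pair."""
    groups = defaultdict(Counter)
    for project, access, page, views in rows:
        groups[(project, access)][page] += views
    return {key: counts.most_common(n) for key, counts in groups.items()}


if __name__ == "__main__":
    print(global_top(ROWS, 2))
    print(len(top_by_breakdown(ROWS, 1)), "separate rankings")
```

With roughly 800 projects and three access methods, the second function would maintain on the order of 2,400 rankings per time period instead of one, which is one plausible reading of why the breakdowns add cost.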
On Sat, Jan 2, 2016 at 2:00 PM, Tilman Bayer tbayer@wikimedia.org wrote:
On Wed, Dec 16, 2015 at 10:58 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Itzik,
The way we're computing top pageviews right now doesn't scale very well; we aren't even able to properly do monthly top pages. So we opened this issue: https://phabricator.wikimedia.org/T120113. When we fix that, it's possible we'll be able to get yearly top pages too, but I'm not promising anything :)
I'm curious what these difficulties are that prevent the calculation of monthly top viewed pages. After a request from the WMF Communications team, I generated a list of the 200 most viewed pages from May 2015 to October 2015 as input for the #Edit2015 video (cf. https://phabricator.wikimedia.org/T117945 ). IIRC that query did not take terribly long to complete.
As for most edited articles, that can be done with a simple query on each database, but it would have really bad performance, like maybe it would never terminate. When we figure out how to compute top pageviews faster (bloom filters maybe?) we'll surface the solution and then it'll be pretty easy to get top edited.
Don't know about bloom filters, but note that one does not need to use the webrequest data for this - MediaWiki itself stores every edit in a revision table. Since this thread, AaronH has provided this data (at the request of the WMF Communications team, as he did last year) at https://phabricator.wikimedia.org/T122604 .
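As a rough illustration of counting edits from a revision table, here is a minimal Python sketch against an in-memory SQLite stand-in. It is not the query used for T122604. The column names rev_page and rev_timestamp follow MediaWiki's revision table as far as I know, but the toy schema, the sample rows, and the most_edited helper are assumptions, and a production wiki database would behave very differently in scale and performance.

```python
import sqlite3


def most_edited(conn, start, end, limit):
    """Return (rev_page, edit_count) for edits with start <= rev_timestamp < end.

    Timestamps are 14-character YYYYMMDDHHMMSS strings, so plain string
    comparison orders them chronologically.
    """
    return conn.execute(
        """
        SELECT rev_page, COUNT(*) AS edits
        FROM revision
        WHERE rev_timestamp >= ? AND rev_timestamp < ?
        GROUP BY rev_page
        ORDER BY edits DESC, rev_page ASC
        LIMIT ?
        """,
        (start, end, limit),
    ).fetchall()


# Toy stand-in for a wiki database (hypothetical rows, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO revision (rev_page, rev_timestamp) VALUES (?, ?)",
    [
        (1, "20150103120000"),
        (1, "20150610080000"),
        (2, "20150701000000"),
        (1, "20141231235959"),  # just before 2015, excluded
        (2, "20160101000000"),  # start of 2016, excluded
    ],
)

if __name__ == "__main__":
    print(most_edited(conn, "20150101000000", "20160101000000", 10))
```

The result lists page IDs rather than titles; joining against the page table to get titles is left out to keep the sketch short.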
On Wed, Dec 16, 2015 at 1:52 PM, Itzik - Wikimedia Israel itzik@wikimedia.org.il wrote:
Hi,
I see that the (amazing!) API still can't give us results for the whole of 2015. So is there any way we can get these page views per project? And also, the most edited articles in 2015 per project?
This can be great PR information for the communications representatives around the world to release to local journalists.
Regards,
Itzik Edri
Chairperson, Wikimedia Israel
+972-(0)-54-5878078 | http://www.wikimedia.org.il
Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment!
--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics