On Wed, Dec 16, 2015 at 10:58 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Itzik,
The way we're computing top pageviews right now doesn't scale very well; we aren't even able to properly do monthly top pages. So we opened this issue: https://phabricator.wikimedia.org/T120113. When we fix that, it's possible we'll be able to get yearly top pages too, but I'm not promising anything :)
I'm curious what these difficulties are that prevent the calculation of monthly top viewed pages. After a request from the WMF Communications team, I generated a list of the 200 most viewed pages from May 2015 to October 2015 as input for the #Edit2015 video (cf. https://phabricator.wikimedia.org/T117945 ). IIRC that query did not take terribly long to complete.
As for most edited articles, that can be done with a simple query on each database, but it would have really bad performance, like maybe it would never terminate. When we figure out how to compute top pageviews faster (bloom filters maybe?) we'll surface the solution and then it'll be pretty easy to get top edited.
Don't know about bloom filters, but note that one does not need to use the webrequest data for this - MediaWiki itself stores every edit in a revision table. Since this thread, AaronH has provided this data (at the request of the WMF Communications team, as he did last year) at https://phabricator.wikimedia.org/T122604 .
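For illustration, a per-wiki count from the revision table might look roughly like the sketch below. It runs against a toy in-memory SQLite database rather than a real MediaWiki replica. The column names (rev_page, rev_timestamp, page_id, page_namespace, page_title) follow the MediaWiki schema, but the function names, the sample rows, and the choice to restrict to namespace 0 are assumptions made for the example, and this is not necessarily the query AaronH used.

```python
import sqlite3


def top_edited(conn, year, limit=10):
    """Return (page_title, edit_count) for the most edited articles in `year`.

    Assumes rev_timestamp is stored as a 14-character YYYYMMDDHHMMSS string,
    so a plain string comparison selects the year.
    """
    start = f"{year}0101000000"
    end = f"{year + 1}0101000000"
    return conn.execute(
        """
        SELECT p.page_title, COUNT(*) AS edits
        FROM revision r
        JOIN page p ON p.page_id = r.rev_page
        WHERE p.page_namespace = 0              -- articles only (assumption)
          AND r.rev_timestamp >= ? AND r.rev_timestamp < ?
        GROUP BY p.page_id, p.page_title
        ORDER BY edits DESC, p.page_title ASC   -- title breaks ties
        LIMIT ?
        """,
        (start, end, limit),
    ).fetchall()


def make_demo_db():
    """Build a tiny stand-in database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE revision (
            rev_id INTEGER PRIMARY KEY,
            rev_page INTEGER,
            rev_timestamp TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, 0, "Alpha"), (2, 0, "Beta"), (3, 1, "Alpha")],  # page 3 is a talk page
    )
    revisions = [
        (1, "20141231235959"),  # Alpha, outside 2015
        (1, "20150101000000"),
        (1, "20150615120000"),
        (1, "20151231235959"),
        (2, "20150301080000"),
        (2, "20150302080000"),
        (3, "20150401000000"),  # talk page edits are excluded
        (3, "20150402000000"),
        (3, "20150403000000"),
        (3, "20150404000000"),
    ]
    conn.executemany(
        "INSERT INTO revision (rev_page, rev_timestamp) VALUES (?, ?)", revisions
    )
    return conn


if __name__ == "__main__":
    for title, edits in top_edited(make_demo_db(), 2015, limit=5):
        print(title, edits)
```

On the sample rows this lists Alpha with 3 edits and Beta with 2. On a large production wiki the same GROUP BY would have to scan a year of revisions, which is the performance concern raised above.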
On Wed, Dec 16, 2015 at 1:52 PM, Itzik - Wikimedia Israel itzik@wikimedia.org.il wrote:
Hi,
I see that the (amazing!) API still can't give us results for the whole of 2015. So is there any way we can get these page views per project? And also, the most edited articles in 2015 per project?
This could be great PR information for the communications representatives around the world to release to local journalists.
Regards, Itzik Edri Chairperson, Wikimedia Israel +972-(0)-54-5878078 | http://www.wikimedia.org.il Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment!