I'm curious what these difficulties are that prevent the calculation of
monthly top viewed pages. After a request from the WMF Communications
team, I generated a list of the 200 most viewed pages from May 2015 to
October 2015 as input for the #Edit2015 video (cf.
https://phabricator.wikimedia.org/T117945 ). IIRC that query did not
take terribly long to complete.

The current top endpoint has breakdowns by project (~800 wikis) and access method (desktop / mobile-web / mobile-app). I think this makes the computation harder than a single global query, because a separate ranking has to be produced for every project and access-method combination.
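
To illustrate the difference, here is a minimal Python sketch. It is not the actual pipeline behind the top endpoint; the record layout, the sample numbers, and the function names (top_global, top_per_breakdown) are all made up for illustration. It only shows that a global query produces one ranking, while the per-breakdown version produces one ranking per (project, access) pair, so roughly 800 x 3 rankings at the scale mentioned above.

```python
from collections import Counter, defaultdict
import heapq

# Hypothetical toy records: (project, access method, page title, views).
RECORDS = [
    ("en.wikipedia", "desktop", "Main_Page", 100),
    ("en.wikipedia", "mobile-web", "Main_Page", 80),
    ("en.wikipedia", "desktop", "Star_Wars", 60),
    ("de.wikipedia", "desktop", "Hauptseite", 50),
    ("de.wikipedia", "mobile-app", "Star_Wars", 10),
]


def top_global(records, n):
    """One ranking over all projects and access methods combined."""
    totals = Counter()
    for project, _access, page, views in records:
        totals[(project, page)] += views
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])


def top_per_breakdown(records, n):
    """One ranking per (project, access) pair."""
    groups = defaultdict(Counter)
    for project, access, page, views in records:
        groups[(project, access)][page] += views
    return {
        key: heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])
        for key, counts in groups.items()
    }


global_top = top_global(RECORDS, 2)
breakdown_top = top_per_breakdown(RECORDS, 2)
```

With the toy data, global_top is a single list, while breakdown_top holds four separate lists, one for each (project, access) pair that appears in the records.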
 

On Sat, Jan 2, 2016 at 2:00 PM, Tilman Bayer <tbayer@wikimedia.org> wrote:
On Wed, Dec 16, 2015 at 10:58 AM, Dan Andreescu
<dandreescu@wikimedia.org> wrote:
> Itzik,
>
> The way we're computing top pageviews right now doesn't scale very well, we
> aren't even able to properly do monthly top pages.  So we opened this issue:
> https://phabricator.wikimedia.org/T120113.  When we fix that, it's possible
> we'll be able to get yearly top pages too, but I'm not promising anything :)
I'm curious what these difficulties are that prevent the calculation of
monthly top viewed pages. After a request from the WMF Communications
team, I generated a list of the 200 most viewed pages from May 2015 to
October 2015 as input for the #Edit2015 video (cf.
https://phabricator.wikimedia.org/T117945 ). IIRC that query did not
take terribly long to complete.

>
> As for most edited articles, that can be done with a simple query on each
> database, but it would have really bad performance, like maybe it would
> never terminate.  When we figure out how to compute top pageviews faster
> (bloom filters maybe?) we'll surface the solution and then it'll be pretty
> easy to get top edited.
>
Don't know about bloom filters, but note that one does not need to use
the webrequest data for this - MediaWiki itself stores every edit in a
revision table. Since this thread, AaronH has provided this data (on
the request of the WMF Communications team, as he did last year) at
https://phabricator.wikimedia.org/T122604 .
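
As a rough sketch of what counting edits from the revision table could look like (this is not the query used for T122604): the example below uses an in-memory SQLite table as a stand-in for a wiki's MariaDB database. The column names rev_page and rev_timestamp follow MediaWiki's revision schema; the sample rows and the most_edited helper are invented for illustration.

```python
import sqlite3


def most_edited(conn, start, end, limit):
    """Return (rev_page, edit_count) rows for start <= rev_timestamp < end."""
    return conn.execute(
        "SELECT rev_page, COUNT(*) AS edits FROM revision "
        "WHERE rev_timestamp >= ? AND rev_timestamp < ? "
        "GROUP BY rev_page ORDER BY edits DESC, rev_page ASC LIMIT ?",
        (start, end, limit),
    ).fetchall()


# Stand-in for one wiki's database; a real one would be MariaDB, not SQLite.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
# Made-up sample edits; timestamps use MediaWiki's YYYYMMDDHHMMSS format.
conn.executemany(
    "INSERT INTO revision (rev_page, rev_timestamp) VALUES (?, ?)",
    [
        (1, "20150103120000"),
        (1, "20150615080000"),
        (2, "20150701000000"),
        (1, "20151231235959"),
        (2, "20160101000000"),  # falls outside 2015
        (3, "20141231235959"),  # falls outside 2015
    ],
)

top_2015 = most_edited(conn, "20150101000000", "20160101000000", 10)
```

The result is keyed by page ID; getting titles would need a join against the page table, and a real run would have to repeat the query on each wiki's database.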

> On Wed, Dec 16, 2015 at 1:52 PM, Itzik - Wikimedia Israel
> <itzik@wikimedia.org.il> wrote:
>>
>> Hi,
>>
>> I see that the (amazing!) API still can't give us results for the whole of
>> 2015. Is there any way we can get these page views per project? And also the
>> most edited articles in 2015 per project?
>>
>> This could be great PR information for the communications representatives
>> around the world to release to local journalists.
>>
>>
>> Regards,
>> Itzik Edri
>> Chairperson, Wikimedia Israel
>> +972-(0)-54-5878078 | http://www.wikimedia.org.il
>> Imagine a world in which every single human being can freely share in the
>> sum of all knowledge. That's our commitment!
>>
>>
>>



--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics



--
Marcel Ruiz Forns
Analytics Developer
Wikimedia Foundation