Dan, guys, me again.
I cross-checked the numbers from, for example,
pagecounts-2017-02-views-ge-5-totals.bz2
against the tools at tools.wmflabs.org (here for page "Falco").
It seems that, as far as platform is concerned, the dump only contains the "Desktop" numbers, not "Mobile Web" or "Mobile App".
Is that correct?
Is there a way to get a sum over all three platforms?
Thanx, Cheers, JJ
On 06.03.2017 at 17:38, Jörg Jung wrote:
OK, guys, thanx a lot!
On 06.03.2017 at 17:33, Dan Andreescu wrote:
Jorg, the project abbreviations are explained in depth here: https://wikitech.wikimedia.org/wiki/Analytics/Data/Pageviews
On Mon, Mar 6, 2017 at 11:15 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
Yeah, Dan, that will work, thanx.
Just out of curiosity: why are there three projects for "de", and what is the difference between them?
/de/, /de.m/ and /de.zero/
Cheers, JJ

On 06.03.2017 at 15:45, Dan Andreescu wrote:
> Jorg, take a look at https://dumps.wikimedia.org/other/pagecounts-ez/
> which has compressed data without losing granularity. You can get
> monthly files here and download a lot less data.
>
> On Mon, Mar 6, 2017 at 5:40 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
> > Marcel,
> >
> > thanx for your quick answer.
> > My main issue with the dumps (or I don't get something) is:
> >
> > I need to download them first to be able to aggregate and filter.
> > For the year 2016 that would be: 40 MB (average) * 24 h * 30 d * 12 m = about
> > 350 GB
> >
> > As I am not sitting directly at DE-CIX but in my private office, I will
> > face a pretty hard time with that :-)
> >
> > So my idea is that somebody "closer" to the raw data would basically do
> > the aggregation and filtering for me...
> >
> > Will somebody (please)?
> >
> > Thanx, JJ
> >
> > On 06.03.2017 at 11:14, Marcel Ruiz Forns wrote:
> > > Hi Jörg, :]
> > >
> > > Do you mean the top 250K most viewed *articles* in de.wikipedia.org?
> > >
> > > If so, I think you can get that from the dumps indeed. You can find 2016
> > > hourly pageview stats by article for all wikis
> > > here: https://dumps.wikimedia.org/other/pageviews/2016/
> > >
> > > Note that the wiki codes (first column) you're interested in are: /de/,
> > > /de.m/ and /de.zero/.
> > > The third column holds the number of pageviews you're after.
> > > Also, this data set does not include bot traffic as recognized by the
> > > pageview definition <https://meta.wikimedia.org/wiki/Research:Page_view>.
> > > As files are hourly and contain data for all wikis, you'll need some
> > > aggregation and filtering.
> > >
> > > Cheers!
> > >
> > > On Mon, Mar 6, 2017 at 2:59 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
> > > > Ladies, gents,
> > > >
> > > > for a project I plan, I'd need the following data:
> > > >
> > > > Top 250K sites for 2016 in project de.wikipedia.org, user-access.
> > > >
> > > > I only need the name of the site and the corresponding number of
> > > > user-accesses (all channels) for 2016 (sum over the year).
> > > >
> > > > As far as I can see, I can't get that data via REST or by aggregating
> > > > dumps.
> > > >
> > > > So I'd like to ask here if someone would like to help out.
> > > >
> > > > Thanx, cheers, JJ
> > > >
> > > > --
> > > > Jörg Jung, Dipl. Inf. (FH)
> > > > Hasendriesch 2
> > > > D-53639 Königswinter
> > > > E-Mail: joerg.jung@retevastum.de
> > > > Web: www.retevastum.de
> > > > www.datengraphie.de
> > > > www.digitaletat.de
> > > > www.olfaktum.de
> > >
> > > --
> > > *Marcel Ruiz Forns*
> > > Analytics Developer
> > > Wikimedia Foundation
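
For anyone following the thread, the aggregation Marcel describes (filter on the wiki code in the first column, sum the pageviews in the third column) could look roughly like the Python sketch below. It is only an illustration under assumptions that the thread does not spell out: that each line is whitespace-separated with the page title in the second column, and that the wiki codes appear without the surrounding slashes. The function names sum_views and top_n and the sample lines are made up for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Wiki codes named in the thread for de.wikipedia.org
# (assumed to appear in the files without the surrounding slashes).
DE_CODES = frozenset({"de", "de.m", "de.zero"})


def sum_views(lines: Iterable[str], codes=DE_CODES) -> Counter:
    """Sum the third column (pageviews) per page title for the given wiki codes.

    Assumes whitespace-separated lines: wiki code, page title, pageviews, ...
    Lines that do not fit this shape are skipped.
    """
    totals: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or parts[0] not in codes:
            continue
        try:
            views = int(parts[2])
        except ValueError:
            continue  # third column is not a number; ignore the line
        totals[parts[1]] += views
    return totals


def top_n(totals: Counter, n: int) -> List[Tuple[str, int]]:
    """Return the n most viewed titles as (title, views) pairs."""
    return totals.most_common(n)


if __name__ == "__main__":
    # Made-up sample lines, not real dump data.
    sample = [
        "de Falco 10 0",
        "de.m Falco 5 0",
        "de.zero Falco 1 0",
        "en Falco 99 0",
        "de Berlin 7 0",
    ]
    print(top_n(sum_views(sample), 2))
```

To run this over downloaded files, the lines of each file could be fed to sum_views one file at a time (for gzip-compressed files, gzip.open(path, "rt", encoding="utf-8") yields text lines) and the resulting Counter objects added together before calling top_n(totals, 250000).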
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics