Jörg, the project abbreviations are explained in depth here: https://wikitech.wikimedia.org/wiki/Analytics/Data/Pageviews

On Mon, Mar 6, 2017 at 11:15 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
Yeah, Dan, that will work, thanks.

Just out of curiosity: why are there three project codes for "de"
(/de/, /de.m/, and /de.zero/), and what is the difference between them?

Cheers, JJ

On 06.03.2017 at 15:45, Dan Andreescu wrote:
> Jörg, take a look at https://dumps.wikimedia.org/other/pagecounts-ez/
> which has compressed data without losing granularity.  You can get
> monthly files here and download a lot less data.
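
A minimal sketch of building the same yearly top list from those monthly files, with two loudly flagged assumptions: the pagecounts-2016-MM-views-ge-5.bz2 naming and the "de.z" code for de.wikipedia follow the pagecounts-ez conventions and should be verified against the README on that page, and the "views-ge-5" files omit pages with fewer than five views in a month:

#!/usr/bin/env python3
# Sum pagecounts-ez monthly totals for de.wikipedia across 2016.
import bz2
from collections import Counter

totals = Counter()
for month in range(1, 13):
    # Assumed file-name pattern; files downloaded beforehand from
    # https://dumps.wikimedia.org/other/pagecounts-ez/
    path = f"pagecounts-2016-{month:02d}-views-ge-5.bz2"
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split(" ")
            # "de.z" = de.wikipedia in this legacy format (assumption)
            if len(parts) >= 3 and parts[0] == "de.z":
                try:
                    totals[parts[1]] += int(parts[2])  # monthly total column
                except ValueError:
                    pass  # skip malformed count fields

for title, count in totals.most_common(250_000):
    print(title, count)

Twelve monthly files are far less data than a year of hourly ones, which is the point of Dan's suggestion; the trade-off is the legacy format and the ge-5 cutoff.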
>
> On Mon, Mar 6, 2017 at 5:40 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
>
>     Marcel,
>
>     thanks for your quick answer.
>     My main issue with the dumps (unless I'm missing something) is:
>
>     I need to download them first to be able to aggregate and filter.
>     For the year 2016 that would be roughly: 40 MB (average per hourly
>     file) * 24 h * 30 d * 12 m ≈ 350 GB.
>
>     As I am not sitting directly at DE-CIX but in my private office, I
>     will have a pretty hard time with that :-)
>
>     So my idea is that somebody "closer" to the raw data would basically do
>     the aggregation and filtering for me...
>
>     Will somebody (please)?
>
>     Thanks, JJ
>
>     On 06.03.2017 at 11:14, Marcel Ruiz Forns wrote:
>     > Hi Jörg, :]
>     >
>     > Do you mean the top 250K most viewed *articles* on de.wikipedia.org?
>     >
>     > If so, I think you can indeed get that from the dumps. You can find
>     > 2016 hourly pageview stats by article for all wikis here:
>     > https://dumps.wikimedia.org/other/pageviews/2016/
>     >
>     > Note that the wiki codes (first column) you're interested in are
>     > /de/, /de.m/, and /de.zero/. The third column holds the number of
>     > pageviews you're after.
>     > Also, this data set does not include bot traffic, as recognized by
>     > the pageview definition:
>     > https://meta.wikimedia.org/wiki/Research:Page_view
>     > As the files are hourly and contain data for all wikis, you'll need
>     > some aggregation and filtering (see the sketch after this message).
>     >
>     > Cheers!
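
A minimal sketch of that aggregation and filtering, assuming the hourly files from https://dumps.wikimedia.org/other/pageviews/2016/ have already been downloaded and follow the documented space-separated layout (wiki code, page title, view count, byte count); the local directory and output file name here are hypothetical:

#!/usr/bin/env python3
# Sum 2016 hourly pageview dumps into a yearly top-250K list for de.wikipedia.
import glob
import gzip
from collections import Counter

WANTED = {"de", "de.m", "de.zero"}  # desktop, mobile web, Wikipedia Zero
totals = Counter()

# Hypothetical local path holding pageviews-20160101-000000.gz, ...
for path in sorted(glob.glob("pageviews/pageviews-2016*.gz")):
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split(" ")
            if len(parts) >= 3 and parts[0] in WANTED:
                try:
                    totals[parts[1]] += int(parts[2])  # third column: views
                except ValueError:
                    pass  # skip malformed count fields

# Write "title count", most viewed first
with open("de_top250k_2016.txt", "w", encoding="utf-8") as out:
    for title, count in totals.most_common(250_000):
        out.write(f"{title} {count}\n")

2016 has on the order of 8,800 hourly files; a Counter keyed by title for one wiki should fit in a few GB of RAM, so the main cost is the download itself, as discussed below in the thread.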
>     >
>     > On Mon, Mar 6, 2017 at 2:59 AM, Jörg Jung <joerg.jung@retevastum.de> wrote:
>     >
>     >     Ladies, gents,
>     >
>     >     For a project I am planning, I'd need the following data:
>     >
>     >     the top 250K pages for 2016 in the project de.wikipedia.org
>     >     (user access).
>     >
>     >     I only need the name of the page and the corresponding number of
>     >     user accesses (all channels) for 2016 (summed over the year).
>     >
>     >     As far as I can see, I can't get that data via REST or by
>     >     aggregating dumps.
>     >
>     >     So I'd like to ask here if someone would like to help out.
>     >
>     >     Thanks, cheers, JJ
>     >
>     > --
>     > *Marcel Ruiz Forns*
>     > Analytics Developer
>     > Wikimedia Foundation
>     >
>
>

--
Jörg Jung, Dipl. Inf. (FH)
Hasendriesch 2
D-53639 Königswinter
E-Mail:     joerg.jung@retevastum.de
Web:        www.retevastum.de
            www.datengraphie.de
            www.digitaletat.de
            www.olfaktum.de

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics