Ed Summers, 31/01/2013 12:14:
On Wed, Jan 30, 2013 at 4:20 PM, Jörn Hees wikistats@joernhees.de wrote:
afaik it's the other way around: stats.grok.se aggregates the data from the pagecounts-raw dumps.
Oh right! I never noticed the hostname changing when I clicked on the "data available here" link on stats.grok.se :-) That actually makes me feel a lot better knowing the data collection is happening on a wikimedia server.
The results you list are from the normal Wikipedias in the different languages, so I guess wikidata stats are not collected yet. I'd also be very interested in that data to combine it with linked data…
Does anyone know who/what collects the pagecounts-raw dumps on dumps.wikimedia.org?
Technically speaking, it's called webstatscollector. For instance, wikivoyage was added here: https://bugzilla.wikimedia.org/show_bug.cgi?id=42055 (stats.grok.se aggregates all new data automatically, although it doesn't expose it in the interface; you only have to know the prefix used in the raw data.)
Nemo