Hi Diederik,
in principle, all of the Wikimedia projects; currently, all listed at
http://stats.grok.se/ which is the top 100 or so Wikipedias. Plus Commons,
if possible.
As for the number of pages on those, that seems to fluctuate (probably just
my scripts breaking on slow/missing data, occasionally); the largest amount
I can find is May 2012, with ~350K pages. But, this only needs to run once
a month. I could even give you the list if you like, and you can extract
the data for me ;-)
But that would certainly not be a long-term, scalable solution. An SQL
interface on the toolserver would be ideal; a speedy http-based API would
be good as well (maybe even better, as it would not require the toolserver
;-), especially if it can take chunks of data (e.g. POST requests with a 1K
article list), so that I don't have to fire thousands of tiny queries.
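
To make the chunked-request idea concrete, here is a minimal sketch of how a client could split a long page list into 1K batches and build one POST body per batch. No such API exists yet, so the JSON field names ("project", "month", "pages") and the helper names are purely hypothetical; only the batch size of 1K comes from the suggestion above.

```python
import json
from typing import Iterator, List


def chunked(titles: List[str], size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of `titles`, each with at most `size` entries."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def build_payloads(project: str, month: str, titles: List[str],
                   size: int = 1000) -> List[str]:
    """Build one JSON request body per batch of page titles.

    The field names are hypothetical placeholders for whatever a real
    page view API would end up accepting.
    """
    return [
        json.dumps({"project": project, "month": month, "pages": batch})
        for batch in chunked(titles, size)
    ]


if __name__ == "__main__":
    # 2,500 made-up titles turn into three request bodies instead of 2,500 queries.
    demo_titles = [f"Page_{i}" for i in range(2500)]
    bodies = build_payloads("en.wikipedia", "2012-05", demo_titles)
    print(len(bodies))
```

With ~350K pages, this would mean roughly 350 requests per monthly run rather than one request per page.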
Cheers,
Magnus
On Mon, Dec 3, 2012 at 7:38 PM, Diederik van Liere
<dvanliere(a)wikimedia.org> wrote:
Hi Magnus,
Can you send the list of pages for the Wikimedia projects that you are
interested in? Once I know how many pages there are, I can come up with a
solution.
D
On Mon, Dec 3, 2012 at 2:33 PM, Dario Taraborelli <
dtaraborelli(a)wikimedia.org> wrote:
+1
I have received an increasing number of external requests for something
more efficient than the stats.grok.se JSON interface and more
user-friendly than Domas' hourly raw data. I am also one of the interested
consumers of this data.
Diederik, any chance we could prioritize this request? I guess
per-article and per-project daily / monthly pv would be the most useful
aggregation level.
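
As a rough illustration of that aggregation level, the sketch below rolls hourly records up into per-article daily and monthly totals. The record layout (project, title, timestamp, views) and the function names are assumptions made for the example, not the format of the existing raw data or of any internal pipeline.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical hourly record: (project, page title, hour timestamp, views).
Record = Tuple[str, str, datetime, int]


def aggregate(records: Iterable[Record], fmt: str) -> Dict[Tuple[str, str, str], int]:
    """Sum views per (project, title, period), where the period is the
    timestamp formatted with `fmt`."""
    totals: Dict[Tuple[str, str, str], int] = defaultdict(int)
    for project, title, ts, views in records:
        totals[(project, title, ts.strftime(fmt))] += views
    return dict(totals)


def daily(records: Iterable[Record]) -> Dict[Tuple[str, str, str], int]:
    """Per-article totals per calendar day."""
    return aggregate(records, "%Y-%m-%d")


def monthly(records: Iterable[Record]) -> Dict[Tuple[str, str, str], int]:
    """Per-article totals per calendar month."""
    return aggregate(records, "%Y-%m")
```

Per-project totals would follow the same pattern with the title dropped from the key.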
On Dec 3, 2012, at 11:11 AM, Magnus Manske <magnusmanske(a)googlemail.com>
wrote:
Hi all,
as you might know, I have a few GLAM-related tools on the toolserver.
Some are updated once a month, some can be used live, but all are in high
demand by GLAM institutions.
Now, the monthly updated stats have always been slow to run, but they
almost ground to a halt recently. The on-demand tools have stalled
completely.
All these tools get their data from stats.grok.se, which works well but is
not really high-speed; my on-demand tools have apparently been shut out
recently because too many people were using them, DDOSing the server :-(
I know you are working on page view numbers, and from what I gather it's
up-and-running internally already. My requirements are simple: I have a
list of pages on many Wikimedia projects; I need view counts for these
pages for a specific month, per-page.
Now, I know that there is no public API yet, but is there any way I can
get to the data, at least for the monthly stats?
Cheers,
Magnus
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics