Yes, we have a connection limit and a bandwidth cap: two or three simultaneous connections at most.
Ariel
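A minimal sketch of a downloader that stays within the limit described above, assuming a cap of two simultaneous connections and treating a 503 as a transient error worth retrying. The names `fetch`, `fetch_with_retry`, `download_all` and `MAX_CONNECTIONS` are hypothetical, as are the retry count and delays; none of this is the dumps server's documented policy or Ashok's actual script.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Assumption: 2 stays inside the "two or three" connections mentioned above.
MAX_CONNECTIONS = 2


def fetch(url, timeout=60):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_retry(url, fetcher=fetch, retries=5, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetcher(url), retrying on HTTP 503 with exponential backoff."""
    for attempt in range(retries):
        try:
            return fetcher(url)
        except urllib.error.HTTPError as err:
            # Only 503 is treated as transient; other errors are re-raised,
            # as is a 503 on the final attempt.
            if err.code != 503 or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def download_all(urls, max_connections=MAX_CONNECTIONS, **retry_kwargs):
    """Fetch every URL with at most max_connections requests in flight."""
    worker = partial(fetch_with_retry, **retry_kwargs)
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # pool.map yields results in the same order as the input URLs.
        return list(pool.map(worker, urls))
```

Calling `download_all(urls)` with a list of file URLs returns the bodies in input order. For files as large as the hourly dumps, one would more likely stream each response to disk than hold it in memory; the sketch keeps everything in memory only for brevity.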
On Fri, 19-06-2015, at 13:27 -0400, Ashok Rao wrote:
Dario, thanks much.
Ariel, Kevin – I think the 503 errors were a problem because I was running the grab process on parallel threads. It seems to work fine in serial – the only trouble with this being, of course, the time it takes to download all the available data.
Best, Ashok
On Fri, Jun 19, 2015 at 12:17 PM, Kevin Leduc kevin@wikimedia.org wrote:
- Ariel
Hi Ariel, can you comment on the 503 errors happening sometimes while trying to download data from the dumps?
On Fri, Jun 19, 2015 at 1:01 AM, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
Forwarding a note from Ashok Rao (cc’ed), can anyone comment on the dumps server returning 503s?
Ashok – we don’t yet have an in-house API to retrieve pageview data, but the Analytics team is working on one: see this thread. Depending on what you’re doing, http://stats.grok.se/ may also come in handy.
Best, Dario
Begin forwarded message:
From: Ashok Rao raoashok@seas.upenn.edu
Subject: Wikipedia Page views access
Date: June 18, 2015 at 5:53:12 PM GMT+2
To: dario@wikimedia.org
Hi Dario,
Good morning. I'm a student at the University of Pennsylvania and I've been trying to perform a few analyses based on Wikipedia page view data. I've written a script that grabs data from the main dump site – https://dumps.wikimedia.org/other/pagecounts-raw/ – but I run into many sporadic 503 errors (sometimes with the download link, other times with the main page itself). I noticed that some of this data might be available directly on Wikimedia servers that can be used for research purposes.
I was hoping I could get access to this and appreciate your help.
Best, Ashok
--
Ashok M. Rao
The Rajendra and Neera Singh Program in Market and Social Systems Engineering
School of Engineering and Applied Sciences
University of Pennsylvania | Class of '17
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics