[Labs-l] Someone using a Python framework is relentlessly hammering Harvard sites, resulting in an IP range ban.
Maximilian Doerr
maximilian.doerr at gmail.com
Sun Dec 4 17:03:01 UTC 2016
https://phabricator.wikimedia.org/F4978348 Done.
Cyberpower678
English Wikipedia Account Creation Team
ACC Mailing List Moderator
Global User Renamer
> On Dec 4, 2016, at 11:49, Merlijn van Deen (valhallasw) <valhallasw at arctus.nl> wrote:
>
> Hi Maximilian,
>
> https://phabricator.wikimedia.org/file/upload/ allows you to specify 'Visible to'. You can select 'Custom policy' and then select the relevant users.
>
> In the meantime, I'll try to figure out if I can get some information from netstat.
>
> Cheers,
> Merlijn
>
> On 4 December 2016 at 17:36, Maximilian Doerr <maximilian.doerr at gmail.com> wrote:
> Sure, how would I be able to restrict its visibility? Harvard is kind enough to unblock if the culprit is stopped.
>
> As for exact URLs, it’s every domain owned by Harvard, but the access log can provide specifics. According to their IT department, the Python script is attempting to fetch all 140,000 pieces of data about minor planets from www.minorplanetcenter.net. IT also says that doing this at the current rate would severely tie up their servers for quite a while, which they cannot afford.
>
> Cyberpower678
> English Wikipedia Account Creation Team
> Mailing List Moderator
> Global User Renamer
>
> From: Merlijn van Deen (valhallasw) [mailto:valhallasw at arctus.nl]
> Sent: Sunday, December 4, 2016 10:59
> To: maximilian.doerr at gmail.com
> Subject: Re: [Labs-l] Someone using a Python framework is relentlessly hammering Harvard sites, resulting in an IP range ban.
>
> Hi Maximilian,
>
> On 4 December 2016 at 05:51, Maximilian Doerr <maximilian.doerr at gmail.com> wrote:
>
> Would the user who is querying the Harvard sites for planet data, with the UA “weblinkchecker Pywikibot/3.0-dev (g7171) requests/2.2.1 Python/2.7.6.final.0”, please stop or severely throttle the GET requests? It’s making 168 requests a minute to that site, and consequently they banned Labs from accessing it, according to the IT department there, who kindly shared the access log with me.
>
> Would you be able to share the access log with the Tools admins (say, via Phabricator, only shared to Yuvi, Bryan Davis, Andrew Bogott, Chase, scfc and me)? From the combination of external IP and timestamp we may be able to pinpoint which tool was causing this.
>
> Can you also clarify which exact URLs we are talking about?
>
> Cheers,
> Merlijn
>
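
For illustration, the kind of throttling requested in the thread above could look like the following minimal Python sketch. It is not Pywikibot's own throttle, nor the code of the tool in question. The `Throttle` class and the `max_per_minute` parameter are hypothetical names, and the limit of 10 requests per minute in the usage note is an arbitrary example value, not a figure given by Harvard IT.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait().

    Hypothetical helper for illustration only; not part of Pywikibot.
    """

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        # Minimum number of seconds between two consecutive requests.
        self.min_interval = 60.0 / max_per_minute
        # clock and sleep are injectable so the class can be tested
        # without actually waiting.
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
        return now


# Usage (assumed example value of 10 requests per minute):
#
#     throttle = Throttle(max_per_minute=10)
#     for url in urls:
#         throttle.wait()
#         ...  # perform the GET request for url here
```

At the 168 requests a minute reported in the access log, the client was sending roughly one request every 0.36 seconds; a throttle like this one spaces requests by a fixed minimum interval instead.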