[Labs-l] tools.wmflabs.org - Mostly gives "Internal Error 500"
Tim Landscheidt
tim at tim-landscheidt.de
Tue Aug 6 17:26:48 UTC 2013
I wrote:
>> Traffic is increasing quickly (though it is still a fraction of
>> Toolserver's) and half of it is from Geohack, so it's normal to
>> see some growing pains. You seem to be right that 500 errors
>> are increasing: according to
>> http://tools.wmflabs.org/awstats/ they were 4.1 % of the
>> "valid" requests in July and they have been 4.9 % so far in
>> August.
>> (The millions of 404 and 403 errors per month are even more
>> mysterious though.)
> Probably related as Geohack seems to trigger accesses to
> /~dispenser/temp/clickheat/js/clickheat.js (in today's log:
> 66639) and /~geohack/siteicon.png (53241). Other common
> misses are /apple-touch-icon.png (752),
> /apple-touch-icon-precomposed.png (1165) and /robots.txt
> (736).
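
For anyone who wants to reproduce per-path miss counts like these, here is a minimal sketch. It assumes the access log is in Common/Combined Log Format; the helper name count_misses and the sample lines are hypothetical and not taken from the actual Tools logs.

```python
import re
from collections import Counter

# Assumption: Common/Combined Log Format, i.e. the request line is quoted
# and is followed directly by the three-digit status code.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def count_misses(lines, status="404"):
    """Count requests per path (query string stripped) for one status code."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == status:
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [06/Aug/2013:10:00:00 +0000] "GET /~geohack/siteicon.png HTTP/1.1" 404 345',
    '192.0.2.2 - - [06/Aug/2013:10:00:01 +0000] "GET /~geohack/siteicon.png HTTP/1.1" 404 345',
    '192.0.2.3 - - [06/Aug/2013:10:00:02 +0000] "GET /robots.txt HTTP/1.1" 404 345',
    '192.0.2.4 - - [06/Aug/2013:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 1024',
]

misses = count_misses(sample)
for path, hits in misses.most_common():
    print(hits, path)
```

On a real log, passing the open file object as `lines` should work the same way, since the function only iterates over it.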
Regarding robots.txt, I've started a change at
https://gerrit.wikimedia.org/r/77916. Toolserver's
robots.txt is:
| User-agent: msnbot
| Disallow: /
| User-agent: *
| Disallow: /~magnus/geo/geohack.php
| Disallow: /~daniel/WikiSense
| Disallow: /~geohack/
| Disallow: /~enwp10/
| Disallow: /~cbm/cgi-bin/
(WikiSense is CatScan, IIRC.) Excluding Geohack is probably
a good idea. Do other tool authors have tools that they do
not want search engine bots to crawl?
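
To see what the Toolserver rules quoted above would mean for a crawler, here is a rough sketch using Python's standard urllib.robotparser. The rules are copied from the file above; the helper is_allowed, the agent name "SomeBot" and the path /~example/tool.php are made up for illustration. Real crawlers may interpret robots.txt somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted from Toolserver's robots.txt.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
User-agent: *
Disallow: /~magnus/geo/geohack.php
Disallow: /~daniel/WikiSense
Disallow: /~geohack/
Disallow: /~enwp10/
Disallow: /~cbm/cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, path)


# "SomeBot" and /~example/tool.php are hypothetical.
for agent, path in [
    ("msnbot", "/~example/tool.php"),
    ("SomeBot", "/~geohack/geohack.php"),
    ("SomeBot", "/~daniel/WikiSense"),
    ("SomeBot", "/~example/tool.php"),
]:
    print(agent, path, "allowed" if is_allowed(agent, path) else "blocked")
```

Note that the Disallow lines are prefix matches, so "/~geohack/" covers everything below that directory.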
Tim