[Labs-l] tools.wmflabs.org - Mostly gives "Internal Error 500"

Tim Landscheidt tim at tim-landscheidt.de
Tue Aug 6 17:26:48 UTC 2013


I wrote:

>> Traffic increases quickly (though still a fraction of
>> Toolserver's) and half of it is from geohack, so it's normal
>> to see some growth pains. You seem to be right that 500 errors
>> are increasing: according to
>> http://tools.wmflabs.org/awstats/ they were 4.1 % of the
>> "valid" requests in July and they have been 4.9 % so far in
>> August.
>> (The millions of 404 and 403 errors per month are even more
>> mysterious though.)

> Probably related as Geohack seems to trigger accesses to
> /~dispenser/temp/clickheat/js/clickheat.js (in today's log:
> 66639) and /~geohack/siteicon.png (53241).  Other common
> misses are /apple-touch-icon.png (752),
> /apple-touch-icon-precomposed.png (1165) and /robots.txt
> (736).
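
For anyone who wants to reproduce such per-path counts, here is a
minimal sketch.  It assumes the access log is in Apache's
common/combined format; the function name count_misses and the
sample lines are made up for illustration and are not taken from
the real log.

```python
import re
from collections import Counter

# Picks the request path and status code out of a common/combined log line.
# Assumption: the request field looks like "GET /path HTTP/1.1".
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def count_misses(lines, statuses=("403", "404")):
    """Count requests per path whose status code is in `statuses`."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") in statuses:
            # Drop the query string so hits on one file are grouped together.
            path = match.group("path").split("?", 1)[0]
            misses[path] += 1
    return misses


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [06/Aug/2013:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 404 209 "-" "examplebot"',
    '192.0.2.2 - - [06/Aug/2013:10:00:01 +0000] "GET /~geohack/siteicon.png HTTP/1.1" 404 209 "-" "-"',
    '192.0.2.3 - - [06/Aug/2013:10:00:02 +0000] "GET /~geohack/siteicon.png?x=1 HTTP/1.1" 404 209 "-" "-"',
    '192.0.2.4 - - [06/Aug/2013:10:00:03 +0000] "GET /~geohack/geohack.php HTTP/1.1" 200 5120 "-" "-"',
]

if __name__ == "__main__":
    for path, hits in count_misses(SAMPLE).most_common():
        print(hits, path)
```

Passing statuses=("500",) instead would tally the paths that
return internal errors.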

Regarding robots.txt, I've started a change at
https://gerrit.wikimedia.org/r/77916.  For reference,
Toolserver's robots.txt is:

| User-agent: msnbot
| Disallow: /

| User-agent: *
| Disallow: /~magnus/geo/geohack.php
| Disallow: /~daniel/WikiSense
| Disallow: /~geohack/
| Disallow: /~enwp10/
| Disallow: /~cbm/cgi-bin/

(WikiSense is CatScan IIRC.)  Excluding Geohack is probably
a good idea.  Do other tool authors have tools they do not
want to be crawled by search engine bots?
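
To check what a given rule set would block before deploying it,
something like the following sketch could be used.  It feeds the
Toolserver rules quoted above to Python's standard
urllib.robotparser; the bot name "examplebot" and the path
/~example/tool.php are made-up placeholders, and real crawlers may
interpret rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Toolserver's robots.txt as quoted in this mail.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /~magnus/geo/geohack.php
Disallow: /~daniel/WikiSense
Disallow: /~geohack/
Disallow: /~enwp10/
Disallow: /~cbm/cgi-bin/
"""


def build_parser(text):
    """Parse robots.txt content given as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    # "examplebot" and /~example/tool.php are hypothetical.
    for agent, path in [
        ("examplebot", "/~geohack/geohack.php"),
        ("examplebot", "/~example/tool.php"),
        ("msnbot", "/~example/tool.php"),
    ]:
        print(agent, path, is_allowed(rules, agent, path))
```

Tool authors could append their own Disallow lines to ROBOTS_TXT
and rerun the check against their tool's URLs.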

Tim



