[Mediawiki-l] RE: googlebot

Aldon Hynes ahynes1 at optonline.net
Wed Aug 17 01:32:00 UTC 2005


It is worth noting that I've been having similar problems with
Googlebot on a different system (CivicSpace).  It appeared that Google
was disregarding the robots.txt file.

It appears that they don't reload the robots.txt file with any
regularity, but you can ask them to do so.

Here is the message I got:

--------------------------------------------------------
Once you have added the appropriate robots.txt file entries or meta tags,
you'll need to process your removal request through our public removal
tool. You can access this tool at
http://services.google.com:8882/urlconsole/controller?cmd=reload&lastcmd=login

For more information please visit http://www.google.com/remove.html
--------------------------------------------------------
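For reference, the "meta tags" that message refers to are the standard robots meta tags placed in a page's <head>. A minimal sketch (which tells compliant crawlers not to index the page or follow its links):

```html
<!-- Ask crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex,nofollow">
```

Unlike robots.txt, this has to be emitted on each page you want excluded, so for a whole wiki the robots.txt route is usually easier.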

When I went to their service, I was told that even after submitting the
request, it may take up to 24 hours before it is processed.

Aldon

-----Original Message-----
Date: Tue, 16 Aug 2005 17:18:38 +0200
From: Thomas Koll <tomk32 at gmx.de>
Subject: Re: [Mediawiki-l] googlebot
To: MediaWiki announcements and site admin list
	<mediawiki-l at Wikimedia.org>
Message-ID: <C08D472C-A409-4D08-89C2-57AD8E347E12 at gmx.de>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed


On 16.08.2005 at 17:14, andres wrote:

> While watching my logfiles (tail -f), I happened to notice some weird
> behaviour from the Google spider.
> It spends literally days following all the links in Spezial:Recentchanges.
>
> It may be an Apache configuration mistake (how?), but it may also be
> a MediaWiki problem.
>
> How can I disallow search engines from indexing the recent
> changes?
> They are worthless to index anyway.

you can use http://en.wikipedia.org/robots.txt
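Borrowing from Wikipedia's approach, a minimal robots.txt sketch for this case might look like the following. The paths are assumptions: they suppose the common setup where article pages are rewritten under /wiki/ and the raw scripts (index.php with its query strings) live under /w/; adjust both to your own configuration. The Spezial: prefix matches the German-language namespace from the log above.

```
# Sketch of a robots.txt for a MediaWiki install, assuming pages are
# served under /wiki/ and scripts under /w/ (adjust to your setup).
User-agent: *
# Block the raw script path, which covers index.php?title=... URLs
Disallow: /w/
# Block the rewritten recent-changes pages (German and English namespaces)
Disallow: /wiki/Spezial:Recentchanges
Disallow: /wiki/Special:Recentchanges
```

Note that Disallow lines are simple path prefixes, so query-string variants are only caught by blocking the script path itself.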

ciao, tom




More information about the MediaWiki-l mailing list