[Mediawiki-l] Combating spam

Tels nospam-abuse at bloodgate.com
Tue May 1 10:29:14 UTC 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday 01 May 2007 08:07:13 Rob Church wrote:
> On 01/05/07, Michael Daly <mikedaly at magma.ca> wrote:
> > What about the use of "noindex" to prevent the indexing of old versions
> > of pages?  I read about this on (a somewhat out-of-date) chongqed.org
>
> We do this, as far as I'm aware.
>
> > Apparently, they can use the "recent changes" pages too.  I find that
> > the search engines are frequently accessing recent changes, but I'm not
> > sure how to stop that.
>
> Special pages should all be emitting appropriate <meta> tags with
> "noindex,nofollow" set, so search engines *oughtn't* to be indexing or
> following links from these.

Yeah, but they will still fetch them, causing traffic. And then there are 
the bots that don't obey robots.txt or "noindex, nofollow"...

I came up with this:

	User-agent: BecomeBot
	User-agent: gonzo
	User-agent: NPBot
	User-agent: TMCrawler
	Disallow: /

	User-agent: googlebot
	Crawl-delay: 30
	Disallow: /wiki/index.php?title=Special:
	Disallow: /wiki/index.php?title=Internal:
	Disallow: /wiki/index.php?title=MediaWiki:
	...

	User-agent: *
	Crawl-delay: 120
	Disallow: /wiki/
	...

This shuts MSN and Yahoo out of the wiki completely, as the three big search 
engines together caused about 90% of the traffic to my small wiki, going 
through every old page revision (via Special:Recentchanges) etc.
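
To sanity-check rules like these before deploying them, one option is to feed them to Python's standard urllib.robotparser and ask what each agent may fetch. This is only a sketch: the rules are a reduced copy of the ones above (the "..." lines are left out), and the agent "msnbot" and the test URLs are assumptions picked to exercise each group.

```python
from urllib.robotparser import RobotFileParser

# Reduced copy of the rules from the post; the elided "..." lines are
# omitted and blank lines separate the groups.
ROBOTS_TXT = """\
User-agent: BecomeBot
User-agent: gonzo
User-agent: NPBot
User-agent: TMCrawler
Disallow: /

User-agent: googlebot
Crawl-delay: 30
Disallow: /wiki/index.php?title=Special:
Disallow: /wiki/index.php?title=Internal:
Disallow: /wiki/index.php?title=MediaWiki:

User-agent: *
Crawl-delay: 120
Disallow: /wiki/
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Hypothetical agent/URL pairs, one or two per group.
CHECKS = [
    ("BecomeBot", "/wiki/index.php?title=Main_Page"),
    ("googlebot", "/wiki/index.php?title=Main_Page"),
    ("googlebot", "/wiki/index.php?title=Special:Recentchanges"),
    ("msnbot", "/wiki/index.php?title=Main_Page"),
]

for agent, url in CHECKS:
    allowed = rules.can_fetch(agent, url)
    delay = rules.crawl_delay(agent)
    print(f"{agent:10} {url:45} allowed={allowed} delay={delay}")
```

This only shows how Python's parser reads the file. Real crawlers differ in what they honor; Googlebot, for one, is documented as ignoring Crawl-delay.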

If you have a smaller wiki, teergrubing (tarpitting) certain user-agents 
(like "Java", "larbot", "-" etc.) might also make a lot of sense. See 
http://bloodgate.com/drowns/example for the effect this has :)
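
For readers unfamiliar with the term, teergrubing means answering an unwanted client as slowly as possible so that it wastes its own time. The following is a minimal WSGI sketch of the idea, not the setup used on bloodgate.com: the agent list, the prefix matching, the placeholder body and the 5-second delay are all assumptions made for illustration.

```python
import time

# Hypothetical list, based on the agents named in the post.
TARPIT_AGENTS = ("java", "larbot")


def is_tarpitted(user_agent):
    """Return True for empty/"-" agents or agents starting with a listed name."""
    ua = user_agent.strip().lower()
    if ua in ("", "-"):
        return True
    return any(ua.startswith(name) for name in TARPIT_AGENTS)


def drip(body, delay, sleep=time.sleep):
    """Yield the body one byte at a time, pausing before each byte."""
    for i in range(len(body)):
        sleep(delay)
        yield body[i:i + 1]


def make_app(delay=5.0, sleep=time.sleep):
    """Build a WSGI app that trickles its response to tarpitted agents."""
    body = b"Nothing to see here.\n"  # placeholder for the real page

    def app(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        start_response("200 OK", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        if is_tarpitted(user_agent):
            return drip(body, delay, sleep)
        return [body]

    return app
```

Each tarpitted connection keeps a server worker busy for as long as the response drips, so in practice this sort of thing probably belongs on a server that can hold many idle connections cheaply.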

All the best,

Tels

- -- 
 Signed on Tue May  1 10:23:37 2007 with key 0x93B84C15.
 Get one of my photo posters: http://bloodgate.com/posters
 PGP key on http://bloodgate.com/tels.asc or per email.

 "A witty saying proves nothing."

  -- Voltaire
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iQEVAwUBRjcWencLPEOTuEwVAQI1XAf/QJK2d9/VokTq3v46uyUFrV7CtitnuzpF
vtDOukMDLlCVEMnUYZ0uRK7UENqQhdgpNEeBGBx+TtTXkd2e6IbEWB96lvRR+M2b
KXiujaGap8t851Ash4idF/gt49eSk/hbj1d8757YBL8/10GF2JGlLOfokIraipDQ
Jlx+KJZURF+U0bgNJo7nSPpQLOBsAW35DNRvTYNxsHH79Whh/6Scn6X009yRlqhC
OxdIBRgW+9Y6wcIhdqCkkQ0SQ4G937qZzXfc92G/MFh3ezCR8+Yeuj8aOb1xQZ7X
vtYC7tHDwn4YBZ59hNWPvPTXIopkkoM/coAIw99GcJzufNSJ24KZYg==
=XQiD
-----END PGP SIGNATURE-----


