-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tuesday 01 May 2007 08:07:13 Rob Church wrote:
> On 01/05/07, Michael Daly <mikedaly@magma.ca> wrote:
> > What about the use of "noindex" to prevent the indexing of old
> > versions of pages? I read about this on (a somewhat out-of-date)
> > chongqed.org
>
> We do this, as far as I'm aware.
>
> > Apparently, they can use the "recent changes" pages too. I find
> > that the search engines are frequently accessing recent changes,
> > but I'm not sure how to stop that.
>
> Special pages should all be emitting appropriate <meta> tags with
> "noindex,nofollow" set, so search engines *oughtn't* to be indexing
> or following links from these.
Yeah, but they will still grab them, causing traffic. And then there are the bots that don't obey robots.txt or "noindex, nofollow"...
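For reference, the tag in question looks something like this in the page <head> (the exact markup MediaWiki emits may vary between versions; this is just the general form):

```html
<!-- emitted on special pages, old revisions etc. to keep well-behaved
     crawlers from indexing them or following their links -->
<meta name="robots" content="noindex,nofollow" />
```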
I came up with this:
User-agent: BecomeBot
User-agent: gonzo
User-agent: NPBot
User-agent: TMCrawler
Disallow: /

User-agent: googlebot
Crawl-delay: 30
Disallow: /wiki/index.php?title=Special:
Disallow: /wiki/index.php?title=Internal:
Disallow: /wiki/index.php?title=MediaWiki:
...

User-agent: *
Crawl-delay: 120
Disallow: /wiki/
...
barring MSN and Yahoo from the wiki completely, since the three big search engines together caused about 90% of the traffic on my small wiki, crawling every old page revision (via Special:Recentchanges) etc.
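If you want to sanity-check rules like these before deploying them, Python's standard urllib.robotparser can evaluate a robots.txt locally. A quick sketch, using a trimmed-down version of the rules above (adjust the paths to your own wiki layout):

```python
# Check robots.txt rules locally with Python's standard urllib.robotparser
# (a sketch; the paths mirror my config above, adjust to your own setup).
from urllib import robotparser

RULES = """\
User-agent: googlebot
Crawl-delay: 30
Disallow: /wiki/index.php?title=Special:

User-agent: *
Crawl-delay: 120
Disallow: /wiki/
"""

rp = robotparser.RobotFileParser()
rp.parse(RULES.splitlines())

# googlebot may not fetch Special: pages, but regular articles are fine;
# every other crawler is shut out of /wiki/ entirely.
print(rp.can_fetch("googlebot", "/wiki/index.php?title=Special:Recentchanges"))  # False
print(rp.can_fetch("googlebot", "/wiki/index.php?title=Main_Page"))              # True
print(rp.can_fetch("ExampleBot", "/wiki/index.php?title=Main_Page"))             # False
```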
If you have a smaller wiki, teergrubing (tarpitting) certain user agents (like "Java", "larbot", "-" etc.) might also make a lot of sense. See http://bloodgate.com/drowns/example for the effect this has :)
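The idea behind teergrubing is simply to make abusive clients wait instead of (or before) serving them. A minimal sketch of the decision logic (the agent prefixes and the 60-second delay are made up for illustration; you would call time.sleep() on the result in your request handler):

```python
# Teergrubing sketch: pick a stall time per User-Agent header. The
# prefixes and default delay are illustrative only; tune them for
# your own traffic.
TARPIT_PREFIXES = ("java", "larbot", "-")

def tarpit_delay(user_agent, delay=60):
    """Seconds to stall a request; 0 means serve it normally."""
    ua = (user_agent or "-").strip().lower()
    if ua.startswith(TARPIT_PREFIXES):
        return delay
    return 0

# In a request handler: time.sleep(tarpit_delay(request_user_agent))
```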
All the best,
Tels
- --
Signed on Tue May 1 10:23:37 2007 with key 0x93B84C15.
Get one of my photo posters: http://bloodgate.com/posters
PGP key on http://bloodgate.com/tels.asc or per email.
"A witty saying proves nothing."
-- Voltaire