Re: [Wikitech-l] Spam on en:PHP

9 Feb 2005


Hi,
On Tue, 8 Feb 2005, Rhobite wrote:
...
On Tue, 08 Feb 2005 13:34:48 -0800, Brion Vibber brion@pobox.com wrote:
...
...
But I don't think
CAPTCHA tests are the right approach, due to accessibility issues.
What would you suggest?
Unfortunately I don't have any great suggestions. I've dealt with bot
spam on a much smaller scale on my weblog, and it's not a simple
problem.
How about a lazy Bayesian similarity checker? Spam bots tend to write the
same blabla into several articles. So, after each edit, with a certain
(low) probability, check for identical (or similar) words in the last 100
articles, and flag the articles with matches (or the matched words) as
potential spam, which can then be blocked more efficiently. Of course,
words like "is", "are", "the", etc. will probably match as well, but
there are relatively few such common words (I think something like
1500 in English). The list of common words could be built on the
fly.
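
To make the idea concrete, here is a rough Python sketch, not a worked-out
design. All names in it (LazySpamChecker, record_edit, check_probability,
common_fraction, min_shared) are made up for illustration, and the thresholds
are arbitrary guesses. It uses plain word overlap rather than a real Bayesian
model, and it treats a word as "common" once it has shown up in more than a
given fraction of the edits seen so far, which is one possible way to build
the list on the fly.

```python
import random
import re
from collections import Counter, deque

WORD_RE = re.compile(r"[a-z']+")


class LazySpamChecker:
    """Hypothetical sketch: flag recent edits that share uncommon words."""

    def __init__(self, check_probability=0.1, window=100,
                 common_fraction=0.5, min_shared=3, rng=None):
        self.check_probability = check_probability  # chance of checking an edit
        self.recent = deque(maxlen=window)          # (title, word set) pairs
        self.doc_freq = Counter()                   # word -> edits containing it
        self.edits_seen = 0
        self.common_fraction = common_fraction      # assumed cut-off for "common"
        self.min_shared = min_shared                # assumed match threshold
        self.rng = rng or random.Random()

    def _is_common(self, word):
        # A word counts as common once it appears in more than
        # common_fraction of all edits recorded so far.
        return self.doc_freq[word] > self.common_fraction * self.edits_seen

    def record_edit(self, title, added_text):
        """Record an edit; return [(other_title, shared_words), ...] to review."""
        words = set(WORD_RE.findall(added_text.lower()))
        flagged = []
        # Lazy part: only run the comparison for a random subset of edits.
        if self.rng.random() < self.check_probability:
            rare = {w for w in words if not self._is_common(w)}
            for other_title, other_words in self.recent:
                shared = rare & other_words
                if other_title != title and len(shared) >= self.min_shared:
                    flagged.append((other_title, sorted(shared)))
        # Update the sliding window and the on-the-fly word statistics.
        self.recent.append((title, words))
        self.doc_freq.update(words)
        self.edits_seen += 1
        return flagged


if __name__ == "__main__":
    checker = LazySpamChecker(check_probability=1.0)  # always check, for the demo
    checker.record_edit("Cat", "the cat is a small animal")
    checker.record_edit("PHP", "buy cheap pills online at the shop")
    print(checker.record_edit("Perl", "buy cheap pills online at the store"))
```

One obvious weakness of this sketch: if the same spam text lands in more than
common_fraction of all edits, its words start to count as common and stop
matching, so the cut-off would need tuning against real edit data.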
Just my 1.5 cents,
Dscho
