Although IP blocking isn't perfect, it's probably the most practical option. How much
worse would spam be without the current blacklists? Machine learning is always
an alluring route to take, but it is very hard to get right and is still easily
tricked: 99% of the mail in my Yahoo! spam folder is not spam. But if you have a
way to detect spam using machine learning that doesn't require constant review,
I would gladly pay you for that service.
________________________________
From: Daniel Friesen <daniel(a)nadir-seen-fire.com>
To: mediawiki-l(a)lists.wikimedia.org
Sent: Friday, May 24, 2013 7:09 PM
Subject: Re: [MediaWiki-l] Wiki spam. Stronger fightback.
On Fri, 24 May 2013 13:41:04 -0700, Al Johnson <alj62888(a)yahoo.com> wrote:
Maybe MediaWiki sites can unite to keep a global list of these IPs and
block them as soon as they are submitted. Each MediaWiki site can
auto-submit a spammer's IP to the global list as soon as it's discovered.
What are the problems with this idea?
Al
IP blocking simply doesn't work. It's like playing whack-a-mole against a
billion moles (or trillions on trillions once IPv6 really takes off).
There are too many open proxies, botnet machines, etc., and many of them
are also addresses used by real editors, NAT addresses with editors
behind them, or dynamic IPs that will soon be reassigned to a non-spammer
while the spammer moves on to an unblocked IP.
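
To make the shared-address problem concrete, here is a minimal sketch in Python. The GlobalBlocklist class, its method names, and the sample users are hypothetical stand-ins for the proposed global list, not an existing MediaWiki API; the address comes from a range reserved for documentation.

```python
# Hypothetical in-memory stand-in for the proposed shared global list.
class GlobalBlocklist:
    def __init__(self):
        self._blocked = set()

    def submit(self, ip):
        """Called by any participating wiki when it catches a spammer."""
        self._blocked.add(ip)

    def is_blocked(self, ip):
        return ip in self._blocked


blocklist = GlobalBlocklist()

# Two people share one public address, e.g. behind the same NAT.
edits = [
    {"user": "spambot", "ip": "203.0.113.7"},
    {"user": "real_editor", "ip": "203.0.113.7"},
]

# One wiki reports the spammer's address...
blocklist.submit(edits[0]["ip"])

# ...and every wiki consulting the list now rejects the real editor too.
rejected = [e["user"] for e in edits if blocklist.is_blocked(e["ip"])]
print(rejected)  # ['spambot', 'real_editor']
```

The list has no way to tell the two users apart, because the address is the only thing it stores.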
The proper way to deal with this spam is not by IP but by content. We need
people who know how to train programs on examples of spam and non-spam so
that they can tell the two apart. That's the kind of central database that
would be useful: an extension that sends spam (and, after a while, things
marked non-spam) to a central database, a community on that database that
vets valid and invalid submissions, and eventually a mode for that extension
that uses information generated from that data to start filtering out spam
edits.
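
As a rough illustration of training a program with spam and non-spam, here is a toy naive Bayes filter in Python. Naive Bayes is my choice for the sketch, not something the proposal specifies, and the class name, labels, and training sentences are all made up; a real system would learn from the vetted central database instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one vetted example; label is 'spam' or 'ham' (non-spam)."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior for the class, then a smoothed log likelihood per token.
        score = math.log(self.doc_counts[label] / total_docs)
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, text):
        if 0 in self.doc_counts.values():
            raise ValueError("need at least one spam and one ham example")
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


spam_filter = NaiveBayesSpamFilter()
# Made-up examples standing in for community-vetted submissions.
spam_filter.train("buy cheap pills online now", "spam")
spam_filter.train("cheap watches buy now", "spam")
spam_filter.train("fixed a typo in the installation section", "ham")
spam_filter.train("added a reference for the release date", "ham")

print(spam_filter.is_spam("buy cheap pills"))                     # True
print(spam_filter.is_spam("fixed the installation reference"))   # False
```

With only four training examples this is trivially easy to trick, which is why the vetted central database matters more than the classifier itself.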
I've actually already thought about this and thought about how to make it
friendly to users when their edits accidentally end up considered spam:
https://www.mediawiki.org/wiki/User:Dantman/Anti-spam_system
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]