Yeah, STiki and, more importantly, ClueBot NG are what I mean when I say
"outside of Wikimedia (who already have bots for this)".
I looked into them a bit and planned to ask to look at some of the code if
I went ahead with the project.
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
On Thu, 16 Aug 2012 14:59:56 -0700, Chris Steipp <csteipp@wikimedia.org> wrote:
> Hi Daniel,
>
> A lot of your ideas are covered by
> http://en.wikipedia.org/wiki/Wikipedia:STiki. Andrew has done a lot of
> great research; if you haven't read his papers yet, that might be a
> good intro to the type of machine learning approaches that have been
> used.
>
> That being said, I would love to have some system that is constantly
> learning from the edits that are flagged as spam, that we can query
> with new edits from AbuseFilter to get a score of how likely it is
> that this new edit is spam. If you get around to working on your
> system, it would be great to work out some way to interface.
>
>
> On Thu, Aug 16, 2012 at 11:16 AM, Daniel Friesen
> <lists@nadir-seen-fire.com> wrote:
>> I've had a good idea for an anti-spam system for a while.
>> Blocks, Captchas, and local filters: all the tricks we've been using
>> end up not working well enough to easily deal with the spam on a lot
>> of wikis.
>>
>> I know this because I've been continually dealing with the spam on a
>> small dead wiki. Simple AntiSpam, AntiBot, Captchas, TorBlock, Abuse
>> Filter... Time after time I expand my filters more and more. But
>> inevitably, a few days later, spam not covered by my filters comes
>> through and I have to do it again.
>>
>> I ended up having to deal with it more today and then started writing
>> out the details I've had for a while on a machine-learning based
>> anti-spam system.
>>
>> https://www.mediawiki.org/wiki/User:Dantman/Anti-spam_system
>>
>> Of course, while I have the whole idea for the UI, backend stuff, how to
>> handle the service, etc., I haven't done the actual machine-learning
>> stuff before.
>> Also, naturally, just like Gareth, OAuth, and other things, this is just
>> another one of my ideas that I don't have the time and resources to do
>> and wish I had the financial backing to work on.
>>
>> --
>> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]