Re: [Wikitech-l] Smart machine-learning based anti-spam system (I wish!)

17 Aug 2012


On Thu, 16 Aug 2012 16:50:27 -0700, Tim Starling tstarling@wikimedia.org wrote:
...
On 17/08/12 04:16, Daniel Friesen wrote:
...
Of course. While I have the whole idea for the ui, backend stuff, how
to handle the service, etc... I haven't done the actual
machine-learning stuff before.
I would think that the actual machine learning stuff would be the hard
part. I stopped using Thunderbird's Bayesian spam tagging feature
years ago, when it started sorting emails from smart people in with
the spam. The computer thought that the smart people were using long
words with a similar frequency to the random dictionary words that
padded out the spam messages.
I haven't worked with machine learning either, but I'm guessing it's
not as simple as feeding a pre-tagged data set into a stock Bayesian
filter library.
-- Tim Starling
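
The failure mode Tim describes can be reproduced with a toy word-frequency filter. The following is a minimal sketch of a naive Bayes scorer, not Thunderbird's actual implementation; the function names and the training messages are invented for illustration. Because the uncommon words only ever appeared as padding in the spam samples, a legitimate message that uses them leans towards spam.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per label; messages is a list of (label, text)."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for label, text in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def spam_log_odds(text, counts, totals):
    """Log-odds that text is spam, using add-one (Laplace) smoothing."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    n_spam = sum(counts["spam"].values())
    n_ham = sum(counts["ham"].values())
    score = math.log(totals["spam"] / totals["ham"])
    for word in text.lower().split():
        p_spam = (counts["spam"][word] + 1) / (n_spam + len(vocab))
        p_ham = (counts["ham"][word] + 1) / (n_ham + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

# Hypothetical training set: the spam is padded with random dictionary words.
TRAINING = [
    ("spam", "cheap pills ephemeral ubiquitous paradigm"),
    ("spam", "win money obfuscate heuristic ephemeral"),
    ("ham", "lunch at noon today"),
    ("ham", "the patch is ready for review"),
]

counts, totals = train(TRAINING)
# A legitimate message that happens to use "long words" scores above zero,
# i.e. the filter leans towards spam.
print(spam_log_odds("the heuristic is ephemeral", counts, totals))
# An everyday message scores below zero.
print(spam_log_odds("lunch at noon", counts, totals))
```

A real filter would have far more training data, but the same skew appears whenever uncommon vocabulary is seen mostly in spam.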
Yeah, Bayesian is probably too old to use. ClueBot NG appears to be using an
Artificial Neural Network [ANN] implementation to do its spam testing.
From the documentation [ClueBot NG] it sounds like one of the trickier parts
is understanding the WikiText well enough to extract the words needed and
whatnot out of it.
[ANN] https://en.wikipedia.org/wiki/Artificial_neural_network
[ClueBot NG] https://en.wikipedia.org/wiki/User:ClueBot_NG
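
To give a rough idea of what "extracting the words" from WikiText involves, here is a crude regex-based sketch. It is not ClueBot NG's preprocessing, and extract_words is a hypothetical helper; it assumes templates are not nested and ignores tables, references and magic words, which a real parser would have to handle.

```python
import re

def extract_words(wikitext):
    """Strip common wiki markup and return the lowercase words that remain."""
    # Drop non-nested templates such as {{citation needed}}.
    text = re.sub(r"\{\{[^{}]*\}\}", " ", wikitext)
    # Drop HTML-style tags such as <ref> or <br />.
    text = re.sub(r"<[^>]+>", " ", text)
    # Internal links: keep the label of [[target|label]], or the target of [[target]].
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # External links: keep the label of [http://example.com label].
    text = re.sub(r"\[https?://\S+\s*([^\]]*)\]", r"\1", text)
    # Remove bold/italic quote runs and heading markers.
    text = re.sub(r"'{2,}|={2,}", " ", text)
    return re.findall(r"[a-z]+", text.lower())

sample = ("'''Spam''' is [[unsolicited email|junk mail]] {{citation needed}} "
          "see [http://example.com this site].")
print(extract_words(sample))
```

The resulting word list is the kind of input a classifier (Bayesian or neural) would then be trained on.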
-- 
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
