Re: [Wikitech-l] Smart machine-learning based anti-spam system (I wish!)

17 Aug 2012

On Thu, 16 Aug 2012 16:50:27 -0700, Tim Starling <tstarling(a)wikimedia.org> wrote:

...
 On 17/08/12 04:16, Daniel Friesen wrote:
  Of course. While I have the whole idea for the ui, backend stuff, how
  to handle the service, etc... I haven't done the actual
  machine-learning stuff before.

 I would think that the actual machine learning stuff would be the hard
 part. I stopped using Thunderbird's Bayesian spam tagging feature
 years ago, when it started sorting emails from smart people in with
 the spam. The computer thought that the smart people were using long
 words with a similar frequency to the random dictionary words that
 padded out the spam messages.

 I haven't worked with machine learning either, but I'm guessing it's
 not as simple as feeding a pre-tagged data set into a stock Bayesian
 filter library.

 -- Tim Starling

Yeah, Bayesian is probably too old to use. ClueBot NG appears to be using an
Artificial Neural Network [ANN] implementation to do its spam testing.
From the documentation [ClueBot NG] it sounds like one of the trickier parts
is understanding the WikiText well enough to extract the needed words and
whatnot out of it.
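
To give a rough idea of what that extraction step involves, here is a minimal
sketch in Python. It is not ClueBot NG's code: the function name extract_words
and the regex rules are assumptions made up for illustration, and it only
handles a few common WikiText constructs (non-nested templates, internal links,
external links, bold/italic quotes, and stray HTML tags).

```python
import re
from collections import Counter


def extract_words(wikitext):
    """Hypothetical helper: crude WikiText -> list of lowercase words."""
    # Drop simple, non-nested templates such as {{citation needed}}.
    text = re.sub(r"\{\{[^{}]*\}\}", " ", wikitext)
    # [[Target|label]] -> label, [[Target]] -> Target.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # [http://example.com label] -> label (the URL itself is discarded).
    text = re.sub(r"\[https?://\S+\s*([^\]]*)\]", r"\1", text)
    # Remove bold/italic quote runs ('' and ''').
    text = re.sub(r"'{2,}", "", text)
    # Remove any remaining HTML-style tags.
    text = re.sub(r"<[^>]+>", " ", text)
    # Keep plain alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


sample = (
    "'''Spam''' is [[Email spam|unwanted mail]]{{citation needed}} "
    "from [http://example.com a site]."
)
words = extract_words(sample)
counts = Counter(words)  # word frequencies a classifier could use as features
print(words)
```

Real WikiText has nested templates, tables, parser functions and so on, so a
real system would most likely need an actual parser rather than a handful of
regexes like this.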

[ANN] https://en.wikipedia.org/wiki/Artificial_neural_network
[ClueBot NG] https://en.wikipedia.org/wiki/User:ClueBot_NG

-- 
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
