Gerrit wrote:
The bot I am thinking of would follow Newpages live. It fetches each page and checks it against its database. If the page is classified as ham, the bot moves on. If it is classified as unsure, the bot asks the user whether it is {{delete}}-material: if yes, train as spam and prepend {{delete}} to the article; if no, train as ham. It could add a comment to the article or a message to the talk page: <!-- classified by ... as ... with score ... -->
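A rough sketch of that loop for a single page, as I read the description. Everything named here is an assumption for illustration: the classifier is taken to have already produced a `label` and `score`, `is_delete_material` stands in for asking the user, `train` stands in for whatever updates the database, and `ExampleBot` is a placeholder name.

```python
def handle_new_page(text, label, score, is_delete_material, train,
                    bot_name="ExampleBot"):
    """Return the (possibly modified) text of one page from Newpages.

    label and score come from a classifier that is not shown here;
    is_delete_material(text) -> bool stands in for asking the user;
    train(text, label) stands in for updating the filter's database.
    """
    if label != "unsure":
        # Ham: nothing to do. Any other verdict is not covered by the
        # description above, so this sketch leaves the page untouched.
        return text

    if is_delete_material(text):
        train(text, "spam")
        text = "{{delete}}\n" + text  # prepend the deletion template
    else:
        train(text, "ham")

    # Leave a trace in the article, as suggested in the message.
    comment = "<!-- classified by %s as %s with score %.2f -->" % (
        bot_name, label, score)
    return text + "\n" + comment
```

Passing the user prompt and the trainer in as functions keeps the decision logic separate from fetching and saving pages, which would have to be supplied by whatever bot framework is used.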
I was thinking about Bayesian filters too, but as a means of marking changes in RC for users to deal with, not as a bot. As for marking diffs, my idea would be to flag reductions in article size as such, and to run the filter only on the added text.
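To make the diff idea concrete, here is a minimal sketch using Python's standard `difflib`. The function names and the line-based comparison are my own assumptions, not an existing tool; the Bayesian filter itself is left out and would be run on the `added` text only.

```python
import difflib


def added_text(old, new):
    """Return only the lines that the edit added or rewrote."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    added = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "insert" is brand-new text; "replace" is rewritten text.
        # Both contribute their new-side lines.
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return "\n".join(added)


def summarize_change(old, new):
    """Flag size reductions and collect the text a filter should see."""
    return {
        "shrunk": len(new) < len(old),  # article got smaller
        "added": added_text(old, new),  # feed only this to the filter
    }
```

An edit that only removes text yields an empty `added` string and `shrunk` set to true, so it would be marked as a reduction without involving the filter at all.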
-- Tim Starling