[WikiEN-l] Re: Experiment on new pages
Kat Walsh
mindspillage at gmail.com
Tue Dec 6 03:23:38 UTC 2005
On 12/5/05, Fastfission <fastfission at gmail.com> wrote:
> On a similar topic -- is there any tool for automatically monitoring
> new page creation?
>
> I was thinking about this in the shower today, and it didn't seem that
> it would be very technically difficult (for a guy who knows some VB
> and PHP) to hook up an (external) program which would monitor the
> Special:Newpages RSS feed, then check the page content itself and
> potentially flag it as something to attend to if it met a set of
> variable characteristics -- e.g., is it fewer than 5 words long, does
> it contain the word "fart" or "gay" or "penis", was it created by an
> anon, is it wikified, and so forth. More complicated though still
> quite feasible operations could involve automatically putting a sample
> of the content through Google and seeing if anything comes up, or
> potentially checking for incoming links, etc. At the end of the day
> you could ideally run something which would check all of those marked
> as "potential problems" to see if they had been edited extensively or
> deleted, and then make it easy for the editor running the program to
> take a look at what remained. All in all it wouldn't put any more
> stress on the servers than a user who actually checked these things
> manually, and would potentially catch things that were missed by other
> diligent admins.
>
> Anything like this exist? If not, I might try to cobble one together
> in my (meager) spare time, though I warn you it'll be written in
> Visual Basic... Seems like it would help with at least one problem in
> relation to new pages, if not the more insidious one of false claims
> disguised as encyclopedia articles.
>
> FF
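For illustration, the per-page checks proposed above might look roughly like the Python sketch below. It is only a sketch under stated assumptions: the function name flag_new_page, the word list, the 5-word threshold, the "creator name looks like an IP address means anon" test, and the "contains [[ means wikified" test are all placeholders drawn from the examples in the message, not an existing tool. Fetching the Special:Newpages feed and the Google and incoming-link checks are left out.

```python
import re

# Hypothetical settings, taken from the examples in the message above.
SUSPECT_WORDS = {"fart", "gay", "penis"}
MIN_WORDS = 5

# Assumption: anonymous creators show up as a dotted IPv4 address.
IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def flag_new_page(text, creator):
    """Return a list of reasons a new page might deserve a closer look."""
    reasons = []
    words = re.findall(r"[a-z']+", text.lower())

    if len(words) < MIN_WORDS:
        reasons.append("very short")

    hits = sorted(SUSPECT_WORDS.intersection(words))
    if hits:
        reasons.append("suspect words: " + ", ".join(hits))

    if IP_RE.match(creator):
        reasons.append("created by anon")

    # Crude stand-in for "is it wikified": no internal links at all.
    if "[[" not in text:
        reasons.append("not wikified")

    return reasons
```

A page would be queued for human review whenever the returned list is non-empty; an empty list means none of these particular checks fired, not that the page is fine.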
Gmaxwell has a bot written in Python that usually runs in
#wikipedia-en-suspectedits which reads every diff as it comes through.
A diff gets announced if it contains various words considered
offensive (profanity, ethnic slurs), words considered offensive if not
used in context ("gay" in an article where the previous revision
contained neither "homosexual" nor "gay" nor other similarly related
words), added exclamation points (usually at the least poor
encyclopedic style!), and speedy delete notices; he's taken a few
other suggestions for improvement.
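To make those rules concrete, here is a minimal Python sketch of that kind of diff check. It is not Gmaxwell's code: the function name flag_diff, the word lists, and the speedy-delete markers ("{{delete", "{{db") are hypothetical placeholders, and comparing whole old and new revisions stands in for reading a real diff.

```python
import re

# Hypothetical placeholder lists; a real bot would use much longer ones.
ALWAYS_FLAG = {"fart"}
# Word -> related words that make it "in context" if already present.
CONTEXT_WORDS = {"gay": {"gay", "homosexual"}}
# Assumed template prefixes for speedy delete notices.
SPEEDY_MARKERS = ("{{delete", "{{db")


def tokens(text):
    """Lower-cased set of words in a revision."""
    return set(re.findall(r"[a-z']+", text.lower()))


def flag_diff(old, new):
    """Return the reasons an edit (old -> new revision) looks suspect."""
    reasons = []
    old_words, new_words = tokens(old), tokens(new)
    added = new_words - old_words

    for word in sorted(added & ALWAYS_FLAG):
        reasons.append(f"offensive word added: {word}")

    # Flag a context-sensitive word only if nothing related was there before.
    for word, context in sorted(CONTEXT_WORDS.items()):
        if word in added and not (context & old_words):
            reasons.append(f"out-of-context word added: {word}")

    if new.count("!") > old.count("!"):
        reasons.append("exclamation points added")

    low_old, low_new = old.lower(), new.lower()
    if any(m in low_new and m not in low_old for m in SPEEDY_MARKERS):
        reasons.append("speedy delete notice added")

    return reasons
```

Anything returned would be announced for a human to look at; the sketch makes no attempt to judge the edit itself.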
Unfortunately it's not running at the moment, as the toolserver is still
down, but it's a pretty useful tool to aid RC patrol.
(Yes, Gmaxwell is my significant other, but that's not why I'm
plugging the bot, honest. :-))
-Kat
[[User:Mindspillage]]
--
"There was a point to this story, but it has temporarily
escaped the chronicler's mind." --Douglas Adams