[WikiEN-l] Automatic death flagging?
Andrew Gray
shimgray at gmail.com
Thu Mar 5 18:56:39 UTC 2009
A month or two ago, someone wrote to OTRS asking if we had any way of
displaying a list of people who'd just been listed as "dead" by
Wikipedia. It strikes me that this is quite an interesting idea - on
the one hand, there's some interest from reusers about a ticker of
"recent obituaries", and on the other hand, it's useful for *us* so we
can keep an eye on subtle vandalism and ensure we have cast-iron
confirmation of any reported death... it being, of course, quite
embarrassing to report someone's dead when they aren't.
So, I started thinking. How can we do this? By my rough calculations,
we should have about a dozen "new deaths" a day, among the entire
universe of our articles, so it certainly isn't an unmanageable list
to work with.
Leaving aside the matter of edits to the article text - which'd take
quite a bit of parsing - there are three methods I can see whereby we
could get the material for this:
a) [[Deaths in YEAR]] - entries are manually added to the list
b) {{Recent death}} - this is added to the articles of *some* dead people
c) [[Category:Living people]] - dead people have the cat removed
(Thanks to Jim Redmond for helping me work these out)
a) is perhaps the least timely; a rough experiment suggests that for
any given day, you're only going to get a few of the people who died
that day listed, and that even a week on, several will be redlinks.
It's the easiest one to automate, though - we can just fire off the
RSS feed of the page history - and seems to require solid confirmation
from an external source.
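A rough sketch of how (a) could be automated, assuming two successive revisions of the list page's wikitext are already in hand (for instance, pulled via the page history): report the link targets that appear only in the newer revision. The function names and sample entries are made up for illustration, and the regex only handles plain [[Target]] and [[Target|label]] links.

```python
import re

# Matches [[Target]], [[Target#Section]] and [[Target|label]]; captures the target only.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def link_targets(wikitext):
    """Return the set of article titles linked from a chunk of wikitext."""
    targets = set()
    for match in WIKILINK.finditer(wikitext):
        title = match.group(1).strip()
        # Crude filter: skip anything with a namespace-style prefix
        # (Category:, File:, interwiki). This also drops real titles
        # that contain a colon, which is a known limitation.
        if title and ":" not in title:
            targets.add(title)
    return targets


def newly_listed(old_wikitext, new_wikitext):
    """Titles linked in the new revision but not in the old one, sorted."""
    return sorted(link_targets(new_wikitext) - link_targets(old_wikitext))


# Hypothetical list entries, not real people.
old = "* [[Jane Example]], 90, hypothetical painter.\n"
new = old + "* [[John Placeholder|John Q. Placeholder]], 85, hypothetical author.\n"
print(newly_listed(old, new))
```

Real list entries also link to occupations, countries and so on, so anything built on this would need a further filter to keep only the biographies.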
b) is patchy - not everyone adds the template, and sometimes it's
added and then removed. It should be fairly reliable, though, since it
de facto implies an experienced user is working on the article; it's
not the sort of thing most casual vandals will add!
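One possible way to watch (b), sketched on the assumption that the MediaWiki API's "embeddedin" list is used to ask which articles currently transclude the template: poll it periodically and compare each snapshot with the previous one. The helper names and the sample response are illustrative only; the network fetch itself and result continuation are left out.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def embeddedin_url(template="Template:Recent death", limit=500):
    """Build a query URL listing main-namespace pages that transclude the template."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": template,
        "einamespace": 0,
        "eilimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def titles_from_response(data):
    """Pull page titles out of a decoded embeddedin JSON response."""
    return {page["title"] for page in data.get("query", {}).get("embeddedin", [])}


def diff_snapshots(previous, current):
    """Return (newly tagged, no longer tagged) titles between two polls."""
    return sorted(current - previous), sorted(previous - current)


# Hand-written stand-in for a decoded API response (hypothetical article).
sample = {"query": {"embeddedin": [{"pageid": 1, "ns": 0, "title": "Jane Example"}]}}
added, removed = diff_snapshots(set(), titles_from_response(sample))
print(added, removed)
```

Titles that drop out of the snapshot are worth a look too, since the template is sometimes added and then removed.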
c) is probably the most comprehensive one - everyone gets it removed
eventually - and also the quickest to update. However, it will have
the highest false positive rate - categories can be removed for any
number of reasons, including vandalism and simple mistakes, and
we'd run the risk of triggering this every time a page was blanked.
Plus, it might be computationally expensive to have something
checking this regularly...
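To make the false-positive worry in (c) concrete, here is a minimal sketch that compares the wikitext before and after an edit and only flags the removal of the category when the page has not shrunk drastically. The names, the labels and the 0.5 size threshold are all assumptions picked for illustration, not tested values.

```python
import re

# Matches [[Category:Living people]], allowing a sort key and loose spacing.
LIVING = re.compile(
    r"\[\[\s*Category\s*:\s*Living[ _]people\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def has_living_cat(wikitext):
    """True if the wikitext places the page in Category:Living people."""
    return LIVING.search(wikitext) is not None


def classify_edit(old_wikitext, new_wikitext, min_ratio=0.5):
    """Label an edit by what happened to the Living people category.

    min_ratio is an arbitrary guess: if the new text is less than that
    fraction of the old length, treat the edit as blanking, not a death.
    """
    if not has_living_cat(old_wikitext) or has_living_cat(new_wikitext):
        return "no change"
    if len(new_wikitext) < min_ratio * len(old_wikitext):
        return "probable blanking"
    return "candidate death"


# Hypothetical article text.
old = "'''Jane Example''' is a hypothetical painter.\n" * 3 + "[[Category:Living people]]\n"
new = old.replace("[[Category:Living people]]", "[[Category:2009 deaths]]")
print(classify_edit(old, new))
print(classify_edit(old, ""))
```

A "candidate death" here still only means the category went away; it would need confirming against a source like anything else.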
Any thoughts?
--
- Andrew Gray
andrew.gray at dunelm.org.uk