[WikiEN-l] Automatic death flagging?

Andrew Gray shimgray at gmail.com
Thu Mar 5 18:56:39 UTC 2009


A month or two ago, someone wrote to OTRS asking if we had any way of
displaying a list of people who'd just been listed as "dead" by
Wikipedia. It strikes me that this is quite an interesting idea - on
the one hand, there's some interest from reusers about a ticker of
"recent obituaries", and on the other hand, it's useful for *us* so we
can keep an eye on subtle vandalism and ensure we have cast-iron
confirmation of any reported death... it being, of course, quite
embarrassing to report someone's dead when they aren't.

So, I started thinking. How can we do this? By my rough calculations,
we should have about a dozen "new deaths" a day, among the entire
universe of our articles, so it certainly isn't an unmanageable list
to work with.

Leaving aside the matter of edits to the article text - which'd take
quite a bit of parsing - there are three methods I can see whereby we
could get the material for this:

a) [[Deaths in YEAR]] - entries are manually added to the list
b) {{Recent death}} - this is added to the articles of *some* dead people
c) [[Category:Living people]] - dead people have the cat removed

(Thanks to Jim Redmond for helping me work these out)

a) is perhaps the least timely; a rough experiment suggests that for
any given day, you're only going to get a few of the people who died
that day listed, and that even a week on, several will be redlinks.
It's the easiest one to automate, though - we can just fire off the
RSS feed of the page history - and entries there seem to require solid
confirmation from an external source.
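
As a rough illustration of (a), here is a minimal sketch that compares
two revisions of a [[Deaths in YEAR]] list and reports the wikilinks
that appear only in the newer one. The function names and the sample
wikitext are made up for the example, and fetching the revisions (from
the history feed or otherwise) is left out entirely.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section]];
# only the target part is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def link_targets(wikitext):
    """Return the set of wikilink targets found in a block of wikitext."""
    return {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}


def newly_listed(old_wikitext, new_wikitext):
    """Return targets linked in the new revision but not in the old one."""
    return sorted(link_targets(new_wikitext) - link_targets(old_wikitext))


# Hypothetical sample revisions, not real list entries.
old = "*[[Jane Example]], 80, hypothetical painter.\n"
new = old + "*[[John Sample|J. Sample]], 75, hypothetical [[writer]].\n"
print(newly_listed(old, new))
```

Note that this picks up every new link, not just names, so ordinary
links in the description (like "writer" above) would need filtering
out before the result was much use as a ticker.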

b) is patchy - not everyone adds the template, and sometimes it's
added and then removed. It should be fairly reliable, though, since it
de facto implies an experienced user is working on the article; it's
not the sort of thing most casual vandals will add!
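
For (b), a sketch of how one might spot the template being added or
removed between two revisions of an article. The regex is an
assumption about how the template is usually written (it ignores
redirects to the template, for instance), and the function names are
hypothetical.

```python
import re

# Assumed spellings: {{Recent death}}, {{recent death|...}}, {{Recent_death}}.
RECENT_DEATH = re.compile(r"\{\{\s*recent[ _]death\s*(?:\||\}\})", re.IGNORECASE)


def has_recent_death(wikitext):
    """True if the wikitext appears to transclude the template."""
    return RECENT_DEATH.search(wikitext) is not None


def template_change(old_wikitext, new_wikitext):
    """Classify an edit as 'added', 'removed' or 'unchanged'."""
    before = has_recent_death(old_wikitext)
    after = has_recent_death(new_wikitext)
    if after and not before:
        return "added"
    if before and not after:
        return "removed"
    return "unchanged"
```

Tracking "removed" as well as "added" matters here, since the template
is sometimes added and then taken off again.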

c) is probably the most comprehensive one - everyone gets it removed
eventually - and also the quickest to update. However, it will have
the highest false positive rate - categories can be removed for a
whole host of reasons, including vandalism and simple mistakes, and
we'd run the risk of triggering this every time a page was blanked.
Plus, it might be computationally time-consuming to have something
checking this regularly...
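
To make (c) a bit more concrete, here is a minimal sketch that diffs
two snapshots of the category's membership and throws away pages that
now look deleted or blanked. How the snapshots and the current page
text are obtained is not covered; the function names and the
min_length threshold are assumptions picked for illustration, not
tested values.

```python
def dropped_titles(previous_members, current_members):
    """Titles that were in the earlier snapshot but not the later one."""
    return set(previous_members) - set(current_members)


def candidates(previous_members, current_members, current_text, min_length=200):
    """Dropped titles whose pages still have substantial text.

    current_text maps title -> current wikitext; a missing title is
    treated as a deleted page. min_length is an arbitrary cut-off.
    """
    kept = []
    for title in sorted(dropped_titles(previous_members, current_members)):
        text = current_text.get(title)
        if text is None or len(text.strip()) < min_length:
            continue  # deleted, blanked or gutted: probably not a death edit
        kept.append(title)
    return kept


# Hypothetical snapshots and page text.
yesterday = {"Jane Example", "John Sample", "Pat Placeholder"}
today = {"Pat Placeholder"}
text_now = {"Jane Example": "x" * 500, "John Sample": ""}
print(candidates(yesterday, today, text_now))
```

This only deals with the blanking case; a category removed by mistake
or by a vandal who leaves the rest of the page alone would still get
through, so the output is a list to check rather than a list to
publish.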

Any thoughts?

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk


