[WikiEN-l] "Software Weighs Wikipedians' Trustworthiness"
Tim Starling
tstarling at wikimedia.org
Mon Aug 6 05:47:47 UTC 2007
David Goodman wrote:
> from
> Chronicle of Higher Education, Wired Campus blog.
> http://chronicle.com/wiredcampus/index.php?id=2278
>
> "software that color-codes Wikipedia entries, identifying those
> portions deemed trustworthy and those that might be taken with a grain
> of salt."
>
I spoke with Luca de Alfaro at length about this feature at Wikimania. I
think the technology is great, and the performance is probably good
enough to include it on Wikipedia itself. He assures me he will release
the source code under a free license, as soon as it's presentable. Some
programming work still needs to be done to make it work incrementally
rather than on the entire history of the article, but the theory for
this is entirely in place.

There are two very important and separate elements to this:
1) A "blame map". Some might prefer to call it a "credit map" to be more
polite. This is a data structure that lets you see who is responsible
for what text. It can be updated on every edit. Having one stored in
MediaWiki will enable all sorts of applications. Apparently the
technique is old hat, and not the subject of the present research, but
it'll be great to have an implementation integrated with MediaWiki.
2) A reputation metric. This is a predictor of how long a given user's
edits will stay in an article. It's novel, and it's the main topic of de
Alfaro's research.
These two elements could be used independently in any way we choose.
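To make the blame map idea concrete, here is a minimal Python sketch of
word-level attribution that is updated on every edit. It is not de
Alfaro's implementation: the function name update_blame, the
(word, author) pair representation, and the use of Python's difflib to
diff revisions are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def update_blame(blame, new_text, editor):
    """Return the blame map for a new revision.

    blame    -- list of (word, author) pairs for the previous revision
    new_text -- full text of the new revision
    editor   -- name of the user who saved the new revision (hypothetical)
    """
    old_words = [word for word, _ in blame]
    new_words = new_text.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    updated = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words keep their original author.
            updated.extend(blame[i1:i2])
        elif tag in ("insert", "replace"):
            # New or rewritten words are attributed to the current editor.
            updated.extend((word, editor) for word in new_words[j1:j2])
        # "delete": removed words simply drop out of the map.
    return updated


# Usage: feed revisions in order, one edit at a time.
blame = []
for editor, text in [
    ("alice", "the cat sat on the mat"),
    ("bob", "the black cat sat on the mat"),
]:
    blame = update_blame(blame, text, editor)
```

Because each call needs only the previous map and the new revision text,
this works incrementally and does not reprocess the whole article
history. A real implementation would also have to cope with moved text
and with reverts that restore an earlier author's words, which this
sketch ignores.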
The social implications of having a reputation metric are not lost on de
Alfaro. He gives the following responses to the usual criticisms:
1. The reputation metric does not rank respected, established users --
it has a maximum value which will be routinely obtained by many people.
2. The value of the metric for individual users is obscured. The only
access to it is via the reputation-coloured article text. The annotated
article text has no usernames attached, and the metric is not displayed
on user pages or the like.
3. It's content-based, which makes it harder to game than voting-based
metrics.
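The three properties above can be illustrated with a toy Python sketch.
This is not de Alfaro's actual metric, which is not described here: the
cap MAX_REPUTATION, the gain and penalty weights, and the function names
are all hypothetical. It shows only a score that is driven by whether
text survives rather than by votes, that saturates at a maximum, and
that reaches readers only as per-word shading with no usernames.

```python
MAX_REPUTATION = 10.0  # assumed cap; the real maximum is not given here


def update_reputation(reputation, kept, removed, gain=0.5, penalty=1.0):
    """Return updated scores after one edit by someone else.

    kept / removed -- dicts mapping author -> number of that author's
    words that survived / were deleted in the edit (hypothetical inputs,
    e.g. derived from a blame map).
    """
    updated = dict(reputation)
    for author in set(kept) | set(removed):
        delta = gain * kept.get(author, 0) - penalty * removed.get(author, 0)
        score = updated.get(author, 0.0) + delta
        # Clamp to [0, MAX_REPUTATION] so many users can reach the top value.
        updated[author] = min(MAX_REPUTATION, max(0.0, score))
    return updated


def colour_text(blame, reputation):
    """Map (word, author) pairs to (word, shade) pairs.

    shade 0.0 means fully trusted and 1.0 means least trusted. Usernames
    are dropped, so only the shading is exposed to readers.
    """
    return [
        (word, round(1.0 - reputation.get(author, 0.0) / MAX_REPUTATION, 2))
        for word, author in blame
    ]


# Usage: alice's text mostly survives, bob's is mostly removed.
reputation = update_reputation({}, kept={"alice": 40, "bob": 2}, removed={"bob": 4})
shaded = colour_text([("the", "alice"), ("black", "bob")], reputation)
```

Clamping at a maximum means the score cannot be used to rank established
users against one another, and returning only shades keeps individual
values out of view.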
It's time for us to think about how we want to use this technology.
There are lots of possibilities beyond the precise design that de Alfaro
proposes. Brainstorm away.
-- Tim Starling