On 0, Tim Starling <tstarling@wikimedia.org> scribbled:
David Goodman wrote:
From the Chronicle of Higher Education, Wired Campus blog: http://chronicle.com/wiredcampus/index.php?id=2278
"software that color-codes Wikipedia entries, identifying those portions deemed trustworthy and those that might be taken with a grain of salt."
I spoke with Luca de Alfaro at length about this feature at Wikimania. I think the technology is great, and the performance is probably good enough to include it on Wikipedia itself. He assures me he will release the source code under a free license, as soon as it's presentable. Some programming work still needs to be done to make it work incrementally rather than on the entire history of the article, but the theory for this is entirely in place.
There are two very important and separate elements to this:
- A "blame map". Some might prefer to call it a "credit map" to be more polite. This is a data structure that lets you see who is responsible for what text. It can be updated on every edit. Having one stored in MediaWiki will enable all sorts of applications. Apparently it's old hat, and not the subject of the present research, but it'll be great to have an implementation integrated with MediaWiki.
- A reputation metric. This is a predictor of how long a given user's edits will stay in an article. It's novel, and it's the main topic of de Alfaro's research.
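To make the blame-map idea concrete, here is a minimal sketch of how one could be kept current on each edit, assuming revisions are tokenised into word lists. The function name and the use of difflib are my illustration, not the actual implementation:

```python
# A minimal blame-map update: unchanged tokens keep their original
# attribution, inserted or replaced tokens are credited to the editor
# who made the current revision.
import difflib

def update_blame(old_tokens, old_blame, new_tokens, editor):
    """Return a per-token attribution list for new_tokens."""
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens)
    new_blame = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            new_blame.extend(old_blame[i1:i2])
        elif op in ("replace", "insert"):
            new_blame.extend([editor] * (j2 - j1))
        # "delete": the tokens are gone, nothing to attribute
    return new_blame

# Alice writes a sentence; Bob later changes one word.
rev1 = "the quick brown fox".split()
blame1 = update_blame([], [], rev1, "Alice")
rev2 = "the quick red fox".split()
blame2 = update_blame(rev1, blame1, rev2, "Bob")
# blame2 == ["Alice", "Alice", "Bob", "Alice"]
```

Incremental updates like this only need the previous revision and its blame list, which is presumably why the remaining work is in making it run per-edit rather than over the whole history.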
These two elements could be used independently in any way we choose.
The social implications of having a reputation metric are not lost on de Alfaro. He gives the following responses to the usual criticisms:
- The reputation metric does not rank respected, established users -- it has a maximum value which will be routinely attained by many people.
- The value of the metric for individual users is obscured. The only access to it is via the reputation-coloured article text. The annotated article text has no usernames attached, and the metric is not displayed on user pages or the like.
- It's content-based, which makes it harder to game than voting-based metrics.
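To illustrate the capped, survival-based idea, here is a toy update rule. The cap, step size, and linear form are my assumptions; de Alfaro's actual formula is part of his research and not reproduced here:

```python
# Illustrative reputation update: raise reputation when an edit mostly
# survives later revisions, lower it when it is mostly reverted, and
# clamp to [0, max_rep] so many established users sit at the maximum.
def update_reputation(rep, survived_fraction, max_rep=10.0, step=1.0):
    delta = step * (2.0 * survived_fraction - 1.0)  # in [-step, +step]
    return min(max_rep, max(0.0, rep + delta))

# A user whose edits keep surviving climbs until hitting the cap:
rep = 0.0
for _ in range(12):
    rep = update_reputation(rep, 1.0)
# rep == 10.0
```

The clamp at max_rep is what makes the first point above hold: once many users are pinned at the ceiling, the metric stops distinguishing among them.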
It's time for us to think about how we want to use this technology. There are lots of possibilities beyond the precise design that de Alfaro proposes. Brainstorm away.
-- Tim Starling
The most exciting ones I can think of:
# We can scrap the 'newest 1%' part of semi-protection. Instead of waiting 4 days, write 4 articles!
# We can scrap editcountitis - this reputation metric may still not be ideal, but I suspect it will reflect the value of one's contributions *a heckuva* lot better than # of edits.
# Bots could probably benefit from this. An example: Pywikipedia's followlive.py script follows Newpages looking for dubious articles to display for the user to take action on. You could filter out all pages whose average reputation is > n, or something.
# People have long suggested that edits by anons and new users be buffered for a while or until approved; this might be a way of doing it.
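The followlive.py filter could look something like this; neither this helper nor the threshold exists in Pywikipedia, it's just a sketch of what a per-token blame list plus a user->reputation map would buy a patrol bot:

```python
# Hypothetical helper: score a new page by the mean reputation of the
# authors of its current text, so a patrol bot can skip pages mostly
# written by reputable users.
def average_reputation(blame, reputation, default=0.0):
    """blame: per-token list of usernames; reputation: user -> score."""
    if not blame:
        return default
    return sum(reputation.get(u, default) for u in blame) / len(blame)

reps = {"Alice": 9.0, "Bob": 2.0}
score = average_reputation(["Alice", "Alice", "Bob"], reps)
# score == (9.0 + 9.0 + 2.0) / 3
```

Unknown users fall back to a default score, which conveniently matches the anon/new-user buffering idea: their pages would score low until their text starts surviving.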
-- gwern