Maury,
perhaps I can help explain the behavior you saw in the UCSC system (I am one of the developers). New text is always somewhat orange, to signal to visitors that it has not yet been fully reviewed. The higher the author's reputation, the lighter the shade of orange, but orange it still is (I have no idea how high your computed reputation was when you started writing that article).
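A rough sketch of the idea, in Python (the names, constants, and color mapping below are only illustrative, not the actual WikiTrust code):

# Toy sketch: new text starts with a trust value bounded by its author's
# reputation, and trust is rendered as an orange shade that lightens toward
# white as trust grows. All constants here are hypothetical.

MAX_TRUST = 10.0        # assumed trust ceiling
MAX_REPUTATION = 10.0   # assumed reputation ceiling

def initial_trust(author_reputation):
    """New text is only as trusted as its author's reputation allows."""
    return MAX_TRUST * min(author_reputation, MAX_REPUTATION) / MAX_REPUTATION

def background_shade(trust):
    """Map trust in [0, MAX_TRUST] to an orange-to-white CSS color."""
    lightness = trust / MAX_TRUST             # 0 -> saturated orange, 1 -> white
    r = 255
    g = int(165 + (255 - 165) * lightness)
    b = int(255 * lightness)
    return f"#{r:02x}{g:02x}{b:02x}"

print(background_shade(initial_trust(2.0)))   # darker orange: low-reputation author
print(background_shade(initial_trust(9.0)))   # pale orange: high-reputation author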
Text background becomes white when other people revise it without drastically changing it: this indicates consensus. In our more recent code version, we also have a "vote" button; using it, text can gain trust more quickly, without the need for many revisions. In a live experiment, where people can click on the vote button, I presume the trust of the text would rise more rapidly. Note that the code prevents double voting, voting from sock-puppet accounts, and so on.
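Again only as a sketch (my own simplification of the idea with made-up constants, not the shipped algorithm):

# Sketch: unchanged text inherits part of the trust of whoever revises it,
# a vote pulls trust up faster, and each account can vote only once.

class TextFragment:
    def __init__(self, trust):
        self.trust = trust
        self.voters = set()                  # accounts that already voted

    def survives_revision(self, reviser_trust, pull=0.3):
        """Text left intact by a revision moves toward the reviser's trust."""
        if reviser_trust > self.trust:
            self.trust += pull * (reviser_trust - self.trust)

    def vote(self, voter_id, voter_trust, pull=0.6):
        """A vote raises trust faster than a revision; double votes are ignored."""
        if voter_id in self.voters:
            return                           # prevents double voting
        self.voters.add(voter_id)
        if voter_trust > self.trust:
            self.trust += pull * (voter_trust - self.trust)

frag = TextFragment(trust=2.0)
frag.survives_revision(reviser_trust=8.0)    # trust creeps up with each quiet revision
frag.vote("some_account", voter_trust=8.0)   # a vote gains trust more quickly
frag.vote("some_account", voter_trust=8.0)   # second vote from the same account does nothing
print(round(frag.trust, 2))                  # 6.32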
So, based on what you say, I don't think the system is tripping over diffs. It simply considers new text less trusted and revised text more trusted, which is what we wanted. It appears, however, that we don't do a very good job of describing the algorithm on the web site (I guess we put most of the description work into writing the papers... we will try to improve the web site).
We don't measure "edit work" in number of edits, but in number of words changed. As you say, in our system changing 1000 words across separate edits is worth the same (provided the edits are all kept, i.e., not reverted) as providing a single 1000-word contribution. We did consider giving a larger prize to larger contributions: precisely, making the reputation increment proportional to n^a, where n is the number of words changed and a > 1. This did not work well for Wikipedia, because it ended up under-rewarding the many editors who clean and polish the articles through many small edits. Technically it would be trivial to change the code to such a non-linear reward scheme (rewards proportional to n^a rather than n); whether it is desirable, I have no idea. It does not lead to better quantitative performance of the system, i.e., the resulting trust is no better at predicting future text deletions.
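To make the comparison concrete, here is a small numerical illustration (the scale constant and the exponent are arbitrary choices for the example, not values from our code):

def reward(words_changed, a=1.0, scale=0.01):
    """Reputation increment for a kept (non-reverted) edit of n words: scale * n^a."""
    return scale * words_changed ** a

# Linear scheme (a = 1): 1000 one-word edits earn exactly as much as one 1000-word edit.
print(sum(reward(1) for _ in range(1000)))         # 10.0
print(reward(1000))                                # 10.0

# Superlinear scheme (a = 1.5): the single large contribution now earns far more,
# so the many small clean-up edits get only a small fraction of the total reward.
print(sum(reward(1, a=1.5) for _ in range(1000)))  # 10.0
print(reward(1000, a=1.5))                         # about 316.2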
Luca
The UCSC system did work, but gave me odd results. Apparently I have a very bad reputation, because when I look in the History at the first versions, which I wrote in their entirety, they are colored all yellow!
Newer versions of the same articles had much more white, even though huge portions of the text were still from the original. This may be due to diff problems -- I consider diff to be largely random in effectiveness: sometimes it works, but other times a single whitespace change, especially a vertical one, will make it think the entire article was edited.
My guess is that the system is tripping over diffs like this, and thus considering the article to have been re-written by another editor. When this happens, MY reputation goes down, or so I understand it.
I don't think this system could possibly work if it's based on the wiki's diffs. If it's going to work, it's going to need a much more reliable diffing system.
Another problem I see with it is that it will rank an author whose contributions are 1000 unchanged comma inserts as just as reliable as an author who created a perfect 1000-character article (or perhaps rate the first even higher). There should be some sort of length bias: if an author makes a big edit, out of character, that's important to know.
Maury