[Wikipedia-l] Study on Interfaces to Improving Wikipedia Quality

Luca de Alfaro luca at dealfaro.org
Tue Nov 25 01:35:13 UTC 2008


Maury,

perhaps I can help explain the behavior you saw in the UCSC system (I am one
of the developers).
New text is always somewhat orange, to signal to visitors that it has not
yet been fully reviewed.
The higher the author's reputation, the lighter the shade of orange, but it
is still orange (I have no idea how high your computed reputation was when
you started writing that article).

Text background becomes white when other people revise it without
drastically changing it: this indicates consensus.
In our more recent code version, we also have a "vote" button; with it,
text can gain trust more quickly, without waiting for many revisions.
In a live experiment, where people can click on the vote button, I presume
the trust of the text would rise more rapidly.  Note that the code prevents
double voting, creating sock-puppet accounts to vote, etc.
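
As a rough illustration of the coloring described above, here is a minimal
Python sketch. It is not the actual UCSC code: the RGB values, the [0, 1]
scale for trust and reputation, the 0.8 cap, and both function names are
assumptions made only for this example.

```python
# Hypothetical sketch of trust-based background coloring.
# None of these names or constants come from the real system.
ORANGE = (255, 165, 0)
WHITE = (255, 255, 255)


def background_color(trust):
    """Blend linearly from orange (trust 0.0) to white (trust 1.0)."""
    t = min(max(trust, 0.0), 1.0)  # clamp to [0, 1]
    return tuple(round(o + (w - o) * t) for o, w in zip(ORANGE, WHITE))


def initial_trust(author_reputation, cap=0.8):
    """Trust of brand-new text: grows with reputation but stays below 1,
    so new text is always at least somewhat orange (assumed cap)."""
    r = min(max(author_reputation, 0.0), 1.0)
    return cap * r


# New text by a low-reputation author is strongly orange; new text by a
# high-reputation author is lighter, but still not white.
low = background_color(initial_trust(0.1))
high = background_color(initial_trust(1.0))
```

In this sketch only later revisions or votes, which are not modeled here,
could raise the trust to 1.0 and turn the background white.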

So, based on what you say, I don't think the system is tripping over
diffs.  It is simply considering new text less trusted, and more revised
text more trusted, which is what we wanted.  It appears, however, that we
don't do a very good job of describing the algorithm on the web site (I
guess we put most of the description work into writing the papers... we
will try to improve the web site).

We don't measure "edit work" in number of edits, but in number of words
changed.
As you say, for our system, changing 1000 words in separate edits is the
same (provided the edits are all kept, i.e., not reverted) as providing a
single 1000-word contribution.  We thought of giving a larger reward to
larger contributions: precisely, of making the reputation increment
proportional to n^a, where n is the number of words, and a > 1.  This did
not work well for Wikipedia, because it ended up under-rewarding the work
of the many editors who clean and polish the articles by making many small
edits.  Technically it would be trivial to change the code to
include such a non-linear reward scheme (to adopt rewards proportional to
n^a rather than n); whether it is desirable, I have no idea.  It does not
lead to better quantitative performance of the system, i.e., the resulting
trust is not better at predicting future text deletions.
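
To make the arithmetic concrete, here is a small Python sketch comparing the
two reward schemes. It is only an illustration, not the system's code: the
function name, the proportionality constant of 1, and the value a = 1.5 are
assumptions.

```python
def reputation_increment(words_changed, a=1.0):
    """Hypothetical reward proportional to n**a (constant taken as 1).

    a = 1.0 is the linear scheme; a > 1 favors large contributions.
    """
    return words_changed ** a


# Linear scheme: 1000 kept one-word edits earn the same as a single
# kept 1000-word contribution.
linear_small = sum(reputation_increment(1) for _ in range(1000))
linear_big = reputation_increment(1000)

# Non-linear scheme with an assumed a = 1.5: the single large
# contribution earns far more than the 1000 small edits combined.
nonlinear_small = sum(reputation_increment(1, a=1.5) for _ in range(1000))
nonlinear_big = reputation_increment(1000, a=1.5)

print(linear_small, linear_big)
print(nonlinear_small, nonlinear_big)
```

With a > 1 the gap widens as contributions get larger, which is why editors
who make many small clean-up edits end up comparatively under-rewarded.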

Luca


> The UCSC system did work, but gave me odd results. Apparently I have a
> very bad reputation, because when I look in the History at the first
> versions, which I wrote in entirety, it colored it all yellow!
>
> Newer versions of the same articles had much more white, even though
> huge portions of the text were still from the original. This may be due
> to diff problems -- I consider diff to be largely random in
> effectiveness: sometimes it works, but other times a single whitespace
> change, especially vertical, will make it think the entire article was
> edited.
>
> My guess is that the system is tripping over diffs like this, and thus
> considering the article to have been re-written by another editor.
> Since this has happened, MY reputation goes down, or so I understand
> it.
>
> I don't think this system could possibly work if based on wiki's
> diffs. If it's going to work, it's going to need to use a much more
> reliable system.
>
> Another problem I see with it is that it will rank an author whose
> contributions are 1000 unchanged comma inserts to be as reliable as an
> author who created a perfect 1000-character article (or perhaps rate
> the first even higher). There should be some sort of length bias: if
> an author makes a big edit, out of character, that's important to
> know.
>
> Maury
>
> _______________________________________________
> Wikipedia-l mailing list
> Wikipedia-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
>

