Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia Quality

25 Nov 2008

On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
[snip]
...
  So I don't think, based on what you say, that the system is tripping over
 diffs.
For example: I can't figure out why the text in the image caption is
colored here
http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction

I couldn't initially figure out why *anything* above the external link
section was colored… though the inability to diff contributed to that.

On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
...
  I agree with Gregory that it is very useful to quantify the usefulness of
 trust information on text -- otherwise, all comparisons are very subjective.
 In our WikiSym 08 paper, we measure various parameters of the "trust"
 coloring we compute, including:

   - Recall of deletions.  Only 3.4% of text is in the lower half of trust
   values, yet this is 66% of the text that is deleted in the very next
   revision.
   - Precision of deletions.  Text in the bottom half of trust values has
   probability 33% of being deleted in the next revision, against a probability
   of 1.9% for general text.  The deletion probability rises to 62% for text
   in the bottom 20% of trust values.
   - We study the correlation between the trust of a word, sampled at random
   in all revisions, and the future lifespan of a word (correcting for the
   finite horizon effect due to the finite number of revisions in each
   article), showing positive correlation. [snip]
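
For readers who want to see how the recall and precision of deletions
quoted above fit together, here is a minimal sketch. It is not the code from
the WikiSym 08 paper. The `Word` record, the `deletion_metrics` function, and
the assumption that trust is normalized to the range 0..1 (so that "lower
half of trust values" means trust below 0.5) are all hypothetical and only
for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One word of one revision (hypothetical record, for illustration only)."""
    trust: float        # assumed to be normalized to the range [0, 1]
    deleted_next: bool  # True if the word is absent from the very next revision


def deletion_metrics(words, threshold=0.5):
    """Compute deletion recall/precision for words with trust below `threshold`."""
    low = [w for w in words if w.trust < threshold]
    deleted = [w for w in words if w.deleted_next]
    low_and_deleted = [w for w in low if w.deleted_next]

    return {
        # share of all text that falls in the low-trust band
        "low_fraction": len(low) / len(words) if words else 0.0,
        # recall: share of deleted text that was flagged as low trust
        "recall": len(low_and_deleted) / len(deleted) if deleted else 0.0,
        # precision: share of low-trust text that was actually deleted
        "precision": len(low_and_deleted) / len(low) if low else 0.0,
        # baseline: deletion probability for text in general
        "base_rate": len(deleted) / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    sample = [
        Word(0.1, True),
        Word(0.2, False),
        Word(0.9, True),
        Word(0.8, False),
        Word(0.95, False),
    ]
    print(deletion_metrics(sample))
```

Passing `threshold=0.2` instead would correspond to the "bottom 20% of trust
values" figure, under the same normalization assumption.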

These performance metrics are better than I would have guessed from
browsing through the output. How does the color mapping reflect the
trust values?  Basically when I use it I see a *lot* of colored things
which are perfectly fine. At least for me, the difference between
shades is far less cognitively significant than colored vs
non-colored, so that may be the source of my confusion.

Have you compared your system to a simple toy trust metric?  I'd
propose "revisions by users in their first week and before their first
7 (?) edits are untrusted".  This reflects the existing automatic
trust system on the site (auto-confirmation), and also reflects a
type of trust checking applied manually by editors.  I think that's
the bar any more sophisticated trust metric needs to outperform.
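
To make the proposed baseline concrete, here is a rough sketch of the toy
metric. The function name `is_untrusted` and its parameters are made up for
illustration, the threshold of 7 edits is the tentative number from the
proposal above, and the sketch reads "first week and before their first 7
edits" literally, so a revision is untrusted only when both conditions hold.

```python
from datetime import datetime, timedelta


def is_untrusted(registered, edit_time, prior_edits,
                 min_age=timedelta(days=7), min_edits=7):
    """Toy trust metric: flag revisions made by very new accounts.

    registered  -- when the account was created
    edit_time   -- when the revision was saved
    prior_edits -- number of edits the user had made before this one
    """
    in_first_week = (edit_time - registered) < min_age
    within_first_edits = prior_edits < min_edits
    # Literal reading of the proposal: both conditions must hold.
    return in_first_week and within_first_edits


if __name__ == "__main__":
    registered = datetime(2008, 11, 20)
    # Third edit, four days after registration: untrusted.
    print(is_untrusted(registered, datetime(2008, 11, 24), prior_edits=2))
    # Same user two weeks later: trusted under this reading.
    print(is_untrusted(registered, datetime(2008, 12, 5), prior_edits=2))
```

Replacing `and` with `or` in the return statement gives the stricter reading,
where either a young account or a low edit count is enough to mark the
revision as untrusted.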

Thank you so much for your response!
