[Wikipedia-l] Study on Interfaces to Improving Wikipedia Quality
Gregory Maxwell
gmaxwell at gmail.com
Tue Nov 25 04:14:57 UTC 2008
On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro <luca at dealfaro.org> wrote:
[snip]
> So, based on what you say, I don't think the system is tripping over
> the diffs.
For example: I can't figure out why the text in the image caption is
colored here
http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction
I couldn't initially figure out why *anything* above the external link
section was colored… though the inability to diff contributed to that.
On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro <luca at dealfaro.org> wrote:
> I agree with Gregory that it is very useful to quantify the usefulness of
> trust information on text -- otherwise, all comparisons are very subjective.
> In our WikiSym 08 paper, we measure various parameters of the "trust"
> coloring we compute, including:
>
> - Recall of deletions. Only 3.4% of text is in the lower half of trust
> values, yet this is 66% of the text that is deleted in the very next
> revision.
> - Precision of deletions. Text in the bottom half of trust values has
> probability 33% of being deleted in the next revision, against a probability
> of 1.9% for general text. The deletion probability rises to 62% for text
> in the bottom 20% of trust values.
> - We study the correlation between the trust of a word, sampled at random
> in all revisions, and the future lifespan of a word (correcting for the
> finite horizon effect due to the finite number of revisions in each
> article), showing positive correlation.
[snip]
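For anyone wanting to check figures like these against their own data, the quoted recall/precision definitions reduce to simple counting. A minimal sketch (the per-word trust values and next-revision deletion flags are hypothetical inputs, not the WikiTrust data format):

```python
def deletion_metrics(words, threshold=0.5):
    """words: list of (trust, deleted) pairs, where trust is in [0, 1]
    and deleted is True if the word is gone in the very next revision.
    Returns (recall, precision) for text below the trust threshold."""
    low = [w for w in words if w[0] < threshold]       # low-trust text
    deleted = [w for w in words if w[1]]               # all deleted text
    low_deleted = [w for w in low if w[1]]             # low-trust AND deleted
    recall = len(low_deleted) / len(deleted)           # share of deletions caught
    precision = len(low_deleted) / len(low)            # deletion prob. of low-trust text
    return recall, precision
```

With threshold=0.2 the same function gives the bottom-quintile figure mentioned above.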
These performance metrics are better than I would have guessed from
browsing through the output. How does the color mapping reflect the
trust values? Basically when I use it I see a *lot* of colored things
which are perfectly fine. At least for me, the difference between
shades is far less cognitively significant than colored vs
non-colored, so that may be the source of my confusion.
Have you compared your system to a simple toy trust metric? I'd
propose "revisions by users in their first week and before their first
7 (?) edits are untrusted". This reflects the existing automatic
trust system on the site (auto-confirmation), and also reflects a
type of trust checking applied manually by editors. I think that's
the bar any more sophisticated trust metric needs to outperform.
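The toy baseline I'm proposing is trivial to implement, which is part of its appeal. A sketch, with the week/7-edit thresholds being my guesses above rather than anything the site actually uses for coloring:

```python
from datetime import datetime, timedelta

def edit_is_trusted(revision_time, first_edit_time, prior_edit_count,
                    min_age_days=7, min_edits=7):
    """Toy trust metric: an edit is untrusted if the editor's account
    is less than min_age_days old at revision time, or if the editor
    has made fewer than min_edits prior edits (mirroring the spirit of
    auto-confirmation; thresholds are illustrative)."""
    account_age = revision_time - first_edit_time
    return (account_age >= timedelta(days=min_age_days)
            and prior_edit_count >= min_edits)
```

Comparing the WikiSym-style recall/precision numbers of this one-liner against the full trust computation would show how much the extra machinery actually buys.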
Thank you so much for your response!