On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
[snip]
So, based on what you say, I don't think the system is tripping over
diffs.
For example: I can't figure out why the text in the image caption is
colored here
http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction
I couldn't initially figure out why *anything* above the external link
section was colored… though the inability to diff contributed to that.
On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro <luca(a)dealfaro.org> wrote:
I agree with Gregory that it is very useful to quantify the usefulness of
trust information on text -- otherwise, all comparisons are very subjective.
In our WikiSym 08 paper, we measure various parameters of the "trust"
coloring we compute, including:
- Recall of deletions. Only 3.4% of text is in the lower half of trust
values, yet this is 66% of the text that is deleted in the very next
revision.
- Precision of deletions. Text in the bottom half of trust values has
probability 33% of being deleted in the next revision, against a probability
of 1.9% for general text. The deletion probability rises to 62% for text
in the bottom 20% of trust values.
- We study the correlation between the trust of a word, sampled at random
in all revisions, and the future lifespan of a word (correcting for the
finite horizon effect due to the finite number of revisions in each
article), showing positive correlation.
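As a rough illustration of how the first two figures above could be computed, here is a minimal sketch. It is not the code from the WikiSym 08 paper: the `Word` record, the trust scale normalized to [0, 1], and the reading of "lower half of trust values" as `trust < 0.5` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    # Hypothetical record: one word occurrence in one revision.
    trust: float        # assumed to be normalized to [0, 1]
    deleted_next: bool  # True if the word is gone in the very next revision


def deletion_metrics(words: List[Word], threshold: float = 0.5) -> Dict[str, float]:
    """Recall and precision of 'low trust' as a predictor of deletion."""
    low = [w for w in words if w.trust < threshold]
    deleted = [w for w in words if w.deleted_next]
    low_and_deleted = [w for w in low if w.deleted_next]

    return {
        # Share of all text that falls below the threshold.
        "low_fraction": len(low) / len(words) if words else 0.0,
        # Recall: share of deleted text that was flagged as low trust.
        "recall": len(low_and_deleted) / len(deleted) if deleted else 0.0,
        # Precision: share of low-trust text that was actually deleted.
        "precision": len(low_and_deleted) / len(low) if low else 0.0,
        # Baseline: deletion probability for text in general.
        "base_rate": len(deleted) / len(words) if words else 0.0,
    }
```

Calling `deletion_metrics(words, threshold=0.2)` would give the analogous numbers for the bottom 20% of the scale, under the same assumptions.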
[snip]
These performance metrics are better than I would have guessed from
browsing through the output. How does the color mapping reflect the
trust values? Basically when I use it I see a *lot* of colored things
which are perfectly fine. At least for me, the difference between
shades is far less cognitively significant than colored vs
non-colored, so that may be the source of my confusion.
Have you compared your system to a simple toy trust metric? I'd
propose "revisions by users in their first week and before their first
7 (?) edits are untrusted". This reflects the existing automatic
trust system on the site (auto-confirmation), and also reflects a
type of trust checking applied manually by editors. I think that's
the bar any more sophisticated trust metric needs to outperform.
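To make the proposed baseline concrete, here is a minimal sketch of it. The function name and parameters are hypothetical, the 7-edit threshold is the tentative value from the proposal, and I am assuming (by analogy with auto-confirmation) that an author counts as trusted only once both the age and the edit-count thresholds are met.

```python
from datetime import datetime, timedelta


def is_untrusted(
    registered_at: datetime,
    edit_time: datetime,
    prior_edits: int,
    min_age: timedelta = timedelta(days=7),  # "first week"
    min_edits: int = 7,                      # tentative "7 (?)" from the proposal
) -> bool:
    """Toy trust metric: flag a revision as untrusted if its author is new.

    Assumption: the author must clear BOTH thresholds to be trusted,
    so failing either one marks the revision as untrusted.
    """
    too_new = (edit_time - registered_at) < min_age
    too_few_edits = prior_edits < min_edits
    return too_new or too_few_edits
```

Text from revisions flagged this way could then be scored with the same deletion recall and precision measures, which would give a direct comparison against the baseline.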
Thank you so much for your response!