On Mon, Nov 24, 2008 at 8:35 PM, Luca de Alfaro luca@dealfaro.org wrote: [snip]
So, based on what you say, I don't think the system is tripping over diffs.
For example: I can't figure out why the text in the image caption is colored here http://wiki-trust.cse.ucsc.edu/index.php/Digital_room_correction
I couldn't initially figure out why *anything* above the external link section was colored… though the inability to diff contributed to that.
On Mon, Nov 24, 2008 at 8:22 PM, Luca de Alfaro luca@dealfaro.org wrote:
I agree with Gregory that it is very useful to quantify the usefulness of trust information on text -- otherwise, all comparisons are very subjective. In our WikiSym 08 paper, we measure various parameters of the "trust" coloring we compute, including:
- Recall of deletions. Only 3.4% of text is in the lower half of trust
values, yet this accounts for 66% of the text that is deleted in the very next revision.
- Precision of deletions. Text in the bottom half of trust values has
a 33% probability of being deleted in the next revision, against a probability of 1.9% for general text. The deletion probability rises to 62% for text in the bottom 20% of trust values.
- We study the correlation between the trust of a word, sampled at random
in all revisions, and the future lifespan of a word (correcting for the finite horizon effect due to the finite number of revisions in each article), showing positive correlation.
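To make the two deletion figures concrete, here is a minimal sketch of how recall and precision of deletions could be computed from per-word data. It is not the code from the paper: the function name, the (trust, deleted_next) pair format, and the numeric trust scale are all assumptions made for illustration.

```python
def deletion_metrics(words, threshold):
    """Hypothetical helper, not from the WikiTrust code base.

    words: iterable of (trust, deleted_next) pairs, one per word, where
           deleted_next is True if the word is gone in the next revision.
    threshold: words with trust below this value count as "low trust".
    Returns (recall, precision) of deletions for the low-trust set.
    """
    words = list(words)
    low_trust = [deleted for trust, deleted in words if trust < threshold]
    total_deleted = sum(1 for _, deleted in words if deleted)
    low_trust_deleted = sum(1 for deleted in low_trust if deleted)

    # Recall: share of all deleted text that had been flagged as low trust.
    recall = low_trust_deleted / total_deleted if total_deleted else 0.0
    # Precision: share of low-trust text that actually gets deleted.
    precision = low_trust_deleted / len(low_trust) if low_trust else 0.0
    return recall, precision


# Made-up sample on an assumed 0-9 trust scale, cutoff at the midpoint.
sample = [(1, True), (2, False), (8, True), (9, False), (9, False), (7, True)]
recall, precision = deletion_metrics(sample, threshold=5)
```

With real data, the same two ratios evaluated at different cutoffs (bottom half, bottom 20%) would give figures comparable to the ones quoted above.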
[snip]
These performance metrics are better than I would have guessed from browsing through the output. How does the color mapping reflect the trust values? Basically when I use it I see a *lot* of colored things which are perfectly fine. At least for me, the difference between shades is far less cognitively significant than colored vs non-colored, so that may be the source of my confusion.
Have you compared your system to a simple toy trust metric? I'd propose "revisions by users in their first week and before their first 7 (?) edits are untrusted". This reflects the existing automatic trust system on the site (auto-confirmation), and also reflects a type of trust checking applied manually by editors. I think that's the bar any more sophisticated trust metric needs to outperform.
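For concreteness, the toy baseline could be sketched roughly as below. The function name, the argument names, and the literal "first week AND fewer than 7 prior edits" reading are my assumptions; the thresholds are the tentative ones from the proposal, not tuned values.

```python
from datetime import datetime, timedelta


def is_untrusted(registered, edit_time, prior_edits,
                 max_age=timedelta(days=7), min_edits=7):
    """Hypothetical toy metric: flag a revision as untrusted when its author
    is still within max_age of registering AND has made fewer than
    min_edits edits before this one."""
    in_first_week = edit_time - registered < max_age
    return in_first_week and prior_edits < min_edits


registered = datetime(2008, 11, 1)
new_user_edit = is_untrusted(registered, datetime(2008, 11, 3), prior_edits=2)
busy_user_edit = is_untrusted(registered, datetime(2008, 11, 3), prior_edits=10)
old_account_edit = is_untrusted(registered, datetime(2008, 12, 1), prior_edits=2)
```

If the intended rule is instead "untrusted until both thresholds are passed", replacing `and` with `or` in the return line gives that stricter variant. Either way, all text introduced by an untrusted revision would be colored and everything else left plain, which makes for a direct comparison on the same deletion metrics.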
Thank you so much for your response!