Right. Also, we need to be clear about what we want this to do. It will never be great at identifying fact-checked material. What it is good at is spotting the more dubious stuff, like possible vandalism. This makes it possible to have a "most trusted" stable version, as discussed earlier. Small changes not only can be big in meaning; they also still attest to the trust.

If I read a sentence to change some minor thing, I still read it. If a page wrongly says "he identifies himself as bisexual" or "born in 1885" rather than 1985 when I edit it, I am going to revert that if I catch it, even if I am just making some grammar/syntax cleanup. So each time people look at stuff, it still attests to the page a little bit, from a vandalism perspective.

The algorithm can be made stricter to catch more general dubious info better, but it is not that bad at that already, and the stricter it gets, the more underinclusive it becomes as to what is considered unlikely to be vandalized.

-Aaron Schulz


Date: Fri, 21 Dec 2007 10:34:47 -0800
From: luca@dealfaro.org
To: wikiquality-l@lists.wikimedia.org
Subject: Re: [Wikiquality-l] Wikipedia colored according to trust

If you want to pick out the malicious changes, you also need to flag small changes.

"Sen. Hillary Clinton did *not* vote in favor of war in Iraq"

"John Doe, born in *1947*"

The ** indicates changes.

I can very well make a system that is insensitive to small changes, but then the system would also be insensitive to many kinds of malicious tampering, and one of my goals was to make it hard for anyone to change without leaving at least a minimal trace.
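
To make the trade-off concrete, here is a minimal sketch. It assumes a plain word-level diff from Python's difflib, which is not the actual trust algorithm, and the min_changed threshold and both function names are hypothetical. It only shows how a filter that ignores small edits would also let the one-word "not" change through unflagged.

```python
import difflib


def changed_words(old, new):
    """Return the words inserted or replaced in `new` relative to `old`."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changed = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.extend(new_words[j1:j2])
    return changed


def flag(old, new, min_changed=1):
    """Hypothetical filter: report an edit only if it touches at least
    `min_changed` words; smaller edits are treated as noise and dropped."""
    words = changed_words(old, new)
    return words if len(words) >= min_changed else []


old = "Sen. Hillary Clinton did vote in favor of war in Iraq"
new = "Sen. Hillary Clinton did not vote in favor of war in Iraq"

print(flag(old, new, min_changed=1))  # sensitive setting: the edit is reported
print(flag(old, new, min_changed=3))  # "smoothed" setting: the edit is dropped
```

With min_changed=1 the inserted word is reported; with min_changed=3 the same edit leaves no trace at all, even though it reverses the meaning of the sentence.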

So it's a matter of goals, really.

Luca

On Dec 21, 2007 10:01 AM, Jonathan Leybovich <jleybov@gmail.com> wrote:
One thing that stood out for me in the small sample of articles I
examined was the flagging of innocuous changes by casual users to
correct spelling, grammar, etc.  Thus a "nice-to-have" would be a
"smoothing" algorithm that ignores inconsequential changes such as
spelling corrections, etc. or the reordering of semantically-contained
units of text (for example, reordering the line items in a list w/o
changing the content of any particular line item, etc., or the
reordering of paragraphs and perhaps even sentences.)  I think this
would cover 90% or more of changes that are immaterial to an article's
credibility.
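
As a rough sketch of the reordering part of this idea only, assuming each list item sits on its own line (the function name is hypothetical, and spelling or grammar corrections are not handled here), two versions can be compared as multisets of lines:

```python
from collections import Counter


def is_pure_reordering(old_text, new_text):
    """True if both versions contain exactly the same non-empty lines,
    counted with multiplicity, regardless of their order."""
    def line_counts(text):
        return Counter(line.strip() for line in text.splitlines() if line.strip())
    return line_counts(old_text) == line_counts(new_text)


before = "* apples\n* pears\n* plums"
reordered = "* plums\n* apples\n* pears"
edited = "* plums\n* apples\n* peaches"

print(is_pure_reordering(before, reordered))  # only the order differs
print(is_pure_reordering(before, edited))     # one item's content changed
```

A check like this would let a pure reordering pass without being flagged, while any change inside a line item would still count as a real edit.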




