On Tue, Feb 3, 2009 at 1:14 AM, David Gerard dgerard@gmail.com wrote:
> http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=ove...
To my mind the more important statistic is that 98% of all articles have had their most recent revision reviewed.
2009/2/2 Stephen Bain stephen.bain@gmail.com:
> To my mind the more important statistic is that 98% of all articles have had their most recent revision reviewed.
I agree, that's definitely the most important statistic. A more useful statistic would be the age of the oldest unreviewed revision.
On Tue, Feb 3, 2009 at 2:03 AM, Thomas Dalton thomas.dalton@gmail.com wrote:
> I agree, that's definitely the most important statistic. A more useful statistic would be the age of the oldest unreviewed revision.
It would also be useful to combine the list of articles with outdated reviews with those articles' page-view statistics. Put that together with measurements of how long articles spend with outdated reviews, and you would have a pretty good picture of the real effect of outdated revisions on readers (which may be more or less than the raw 1.5% figure suggests).
(Note that 0.5% of articles have no reviewed revision at all, and 1.5% have a reviewed revision that is not the current one. Information about the ages of revisions in both categories would be interesting.)
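As a minimal sketch of the kind of calculation being proposed, with invented per-article figures standing in for real page-view and backlog data:

```python
# Estimate how many page views actually land on an outdated revision.
# All inputs are hypothetical; real data would come from the review
# backlog and the page-view logs.
articles = [
    # (title, views_per_day, days_with_outdated_review)
    ("Example town", 12, 25.0),
    ("Example band", 40, 3.5),
    ("Example biography", 5, 60.0),
]

total_stale_views = sum(views * days for _, views, days in articles)
print(f"Estimated views of outdated revisions: {total_stale_views:.0f}")
```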
On Mon, Feb 2, 2009 at 3:03 PM, Thomas Dalton thomas.dalton@gmail.com wrote:
> I agree, that's definitely the most important statistic. A more useful statistic would be the age of the oldest unreviewed revision.
17.8 days http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=out...
2009/2/2 Sam Korn smoddy@gmail.com:
> 17.8 days http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=out...
Ask, and you shall receive! Thank you!
So that's 10570 articles that have been waiting over a day (out of 12667 articles with out-of-date reviews). That's pretty bad... I would have expected a long-tail distribution. Any ideas why there are so many very out-of-date articles compared to slightly out-of-date ones?
On Mon, Feb 2, 2009 at 6:30 PM, Thomas Dalton thomas.dalton@gmail.com wrote:
> So that's 10570 articles that have been waiting over a day (out of 12667 articles with out-of-date reviews). That's pretty bad... I would have expected a long-tail distribution. Any ideas why there are so many very out-of-date articles compared to slightly out-of-date ones?
Might it be because they were looked at several times, and each time people went "um, not sure about this" and left it for someone else to do? Flagged revisions is a serious business because the impression is that you are verifying people's work to some standard. Now, if someone quotes an obscure source that you don't have, or haven't heard of, what do you do? Trust the editor? Let it go through anyway? Leave it for someone else to deal with, and watch a backlog build up?

What I'd like to see is a feature where you can click "not sure" and bump the review up several levels of expertise, so the difficult stuff gets naturally filtered to those who have it: say, subject-matter knowledge, a foreign language, or access to an obscure book. Depending on how flexible such a system is, it might make flagging revisions more efficient, not less.
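A rough sketch of how such an escalation might look; everything here (names, levels, fields) is hypothetical, since no such feature exists:

```python
# Hypothetical "not sure" escalation queue; not an existing
# FlaggedRevs feature, just an illustration of the idea above.
from dataclasses import dataclass, field

LEVELS = ["typo-spotting", "general review", "subject expert"]

@dataclass
class PendingReview:
    article: str
    level: int = 0                 # index into LEVELS
    notes: list = field(default_factory=list)

def not_sure(review: PendingReview, reason: str) -> PendingReview:
    """Bump a review one level up the expertise ladder."""
    review.notes.append(reason)
    if review.level < len(LEVELS) - 1:
        review.level += 1
    return review

r = PendingReview("Obscure 18th-century composer")
not_sure(r, "cites a source I can't check")
print(LEVELS[r.level], r.notes)  # general review, with the note recorded
```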
Training people to do rudimentary and moderate and advanced reviews would be next.
Extremely difficult to scale and to harness the right levels of expertise (from typo-spotting upwards), but very rewarding if done right. One problem is edits that combine different sorts of changes, and the "massive chunks of text added in one go".

I presume the current system is a rudimentary one, designed only to catch obvious vandalism? If so, people need to be more alert than before (not less) to subtle vandalism and to good-faith misrepresentation of sources through poor or skewed writing.
Carcharoth
2009/2/2 Carcharoth carcharothwp@googlemail.com:
> Might it be because they were looked at several times, and each time people went "um, not sure about this" and left it for someone else to do? Flagged revisions is a serious business because the impression is that you are verifying people's work to some standard. Now, if someone quotes an obscure source that you don't have, or haven't heard of, what do you do? Trust the editor? Let it go through anyway? Leave it for someone else to deal with, and watch a backlog build up?
I would have expected that to lead to a long tail - the longer a revision has been around, the more chance someone will have been sure enough to do something about it.
> What I'd like to see is a feature where you can click "not sure" and bump the review up several levels of expertise, so the difficult stuff gets naturally filtered to those who have it.
> I presume the current system is a rudimentary one, designed only to catch obvious vandalism?
As I understand it, the German implementation has quite strict requirements for flagging, whereas the suggestions on the English Wikipedia seem to be more about just stopping obvious vandalism. Different levels of flags would seem to be the solution (they have been discussed before, but I guess we need to wait until we have the basics working): "sighted", "fact-checked", "good" and "featured", say.

Anyone who is autoconfirmed could sight a revision, with just a few seconds of review in most cases. Fact-checking would require you to have proven yourself competent, and would take longer, since you have to actually verify all the sources (and may need some experience of the subject in question in order to know whether the sources have been correctly interpreted). Good and featured would follow the existing procedures (and just make it clearer which revision was reviewed).
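Purely as an illustration of that ladder, here is a minimal sketch in code; the group names and the mapping are assumptions, not the actual FlaggedRevs configuration:

```python
# Hypothetical model of graded review flags; an illustration of the
# proposal above, not the real FlaggedRevs configuration format.
FLAG_LEVELS = ["sighted", "fact-checked", "good", "featured"]

# Who may set each flag (assumed groups, for illustration only).
REQUIRED_GROUP = {
    "sighted": "autoconfirmed",
    "fact-checked": "reviewer",
    "good": "reviewer",
    "featured": "reviewer",
}

def may_flag(user_groups: set, flag: str) -> bool:
    """True if a user in the given groups may apply the given flag."""
    return REQUIRED_GROUP[flag] in user_groups

print(may_flag({"autoconfirmed"}, "sighted"))       # True
print(may_flag({"autoconfirmed"}, "fact-checked"))  # False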
I've just checked a small sample of >10d unreviewed changes from the list. About 50% are unreviewed for no obvious reason; they can be (and I have) given the flag within 30 seconds of reading (style changes, URL changes).

The other half are unreferenced additions to articles nobody cares about (small towns, biographies of rather unknown people, unimportant music groups).
This relates to Thomas's posting:

Thomas Dalton thomas.dalton@gmail.com:
> As I understand it, the German implementation has quite strict requirements for flagging. The suggestions on the English Wikipedia seem to be more about just stopping obvious vandalism.
It can't be said for sure whether these additions are vandalism (jokes) or valid ones. Bayesian reasoning would suggest just reviewing them, since on the base rates they are more likely not to be vandalism.
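A toy version of that calculation, with invented numbers standing in for the real rates:

```python
# Toy Bayes calculation with invented numbers: if only 5% of
# unreferenced additions are vandalism, then even an edit that looks
# twice as suspicious as a typical good edit is still probably genuine.
p_vandal = 0.05                    # assumed base rate of vandalism
p_suspicious_given_vandal = 0.60   # assumed
p_suspicious_given_good = 0.30     # assumed

p_suspicious = (p_suspicious_given_vandal * p_vandal
                + p_suspicious_given_good * (1 - p_vandal))
p_vandal_given_suspicious = p_suspicious_given_vandal * p_vandal / p_suspicious
print(f"P(vandalism | suspicious) = {p_vandal_given_suspicious:.2f}")  # ~0.10
```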
OTOH, requiring references for each addition would solve the problem in the other direction.
Regards, Peter
Peter Jacobi wrote:
> OTOH, requiring references for each addition would solve the problem in the other direction.
Every time I've discussed the specifics of "flags" I have come away confused (admittedly, that is not very often). But, as I understand it, it is technically possible to have numerous types of flag, so it is surely possible to adapt the system to cover referencing issues.
I think the argument that no flagged-revision system can exclude 100% of bad edits is the wrong argument, given our tradition of "soft security", where each vetting process may only catch 90% of problems: we should be grateful for that 90%. So if a rather small percentage of edits is missed by the reviewing system, I would suggest some secondary flagging to handle those, taking them out of the main system. At the very least, it is a good idea to reduce duplication of work.
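To put a number on the soft-security point: assuming, hypothetically, two independent passes that each catch 90% of problems, the combined miss rate is 1%:

```python
# If two review passes each catch 90% of bad edits, and (a big
# assumption) their misses are independent, the share slipping past
# both is 10% of 10%.
catch_rate = 0.90
missed_after_two_passes = (1 - catch_rate) ** 2
print(f"{missed_after_two_passes:.0%} of bad edits missed")  # 1%
```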
Charles
On Tue, Feb 3, 2009 at 12:04 PM, Charles Matthews charles.r.matthews@ntlworld.com wrote:
> Every time I've discussed the specifics of "flags" I have come away confused
I'm hoping it will work in practice like Wikisource, where there are four levels of approval as a text goes through the various transcription and proofreading stages, but I may be misunderstanding the differences. To see flagged revisions in action, as far as I'm aware, the best thing to do is to go to the German Wikipedia or a test wiki (is there one?).
Carcharoth
On Tue, Feb 3, 2009 at 11:10 PM, Carcharoth carcharothwp@googlemail.com wrote:
> I'm hoping it will work in practice like Wikisource, where there are four levels of approval as a text goes through the various transcription and proofreading stages, but I may be misunderstanding the differences. To see flagged revisions in action, as far as I'm aware, the best thing to do is to go to the German Wikipedia or a test wiki (is there one?).
More info here: http://quality.wikimedia.org/
Test Wiki has it enabled:
http://test.wikipedia.org/wiki/Special:UnreviewedPages
So does the En lab (using English Wikibooks data).
English Wikibooks has it in production:
http://en.wikibooks.org/wiki/Using_Wikibooks/Reviewing_Pages
English Wikinews has it in production:
http://en.wikinews.org/wiki/Wikinews:Flagged_revisions
Hebrew and Russian Wikisource have FlaggedRevs enabled; I've not checked how well they are using it: http://wikisource.org/wiki/WS:COORD
-- John Vandenberg
On Mon, Feb 2, 2009 at 1:44 PM, Carcharoth carcharothwp@googlemail.com wrote:
> What I'd like to see is a feature where you can click "not sure" and bump the review up several levels of expertise, so the difficult stuff gets naturally filtered to those who have it.
> I presume the current system is a rudimentary one, designed only to catch obvious vandalism?
I think FlaggedRevs' main object is to reduce vandalism and obvious misinformation. It's a fairly blunt tool, one that isn't well suited to protecting content more comprehensively. That shouldn't be held against it, in my opinion: it is well suited to the task for which it was intended.

As for an escalation track for flagging a revision based on expertise - well, I don't think that is likely to happen, or to work. There are too many obvious flaws for a ladder-based system to function. What might conceivably work in this vein is an intermediate flag for a revision that says "expertise needed", but I don't think we've really discussed flagging or not flagging revisions based on evaluations of the depth you're implying. That would probably need years more discussion.
Nathan
2009/2/2 Nathan nawrich@gmail.com:
> I think FlaggedRevs' main object is to reduce vandalism and obvious misinformation. It's a fairly blunt tool, one that isn't well suited to protecting content more comprehensively.
In what way is it blunt? It's extremely customisable: you can have a variety of different flags, issued by a variety of different people, with a variety of different default views for different pages and different flags.
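To illustrate what "different default views" could mean in practice, here is a minimal sketch of per-page view selection; the data layout and threshold handling are assumptions for illustration, not the extension's actual code:

```python
# Illustrative logic for choosing which revision a reader sees by
# default; field names and thresholds are assumed, not FlaggedRevs code.
from typing import Optional

def revision_to_show(revisions: list, page_threshold: Optional[str]):
    """revisions: newest-first list of (rev_id, flag) pairs, where flag
    is e.g. "sighted" or None; page_threshold is the flag this page
    requires for its default view, or None to always show the latest."""
    if page_threshold is None:
        return revisions[0][0]
    for rev_id, flag in revisions:
        if flag == page_threshold:
            return rev_id
    return revisions[0][0]  # nothing flagged yet: fall back to latest

history = [(105, None), (104, None), (103, "sighted"), (101, "sighted")]
print(revision_to_show(history, "sighted"))  # 103
print(revision_to_show(history, None))       # 105
```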