My experiment has concluded and all the link removals have been reverted*. The full writeup is at http://www.gwern.net/In%20Defense%20Of%20Inclusionism#sins-of-omission-exper...
Result: Of the 100 removals, just 3 were reverted.
3% is even lower than I expected, and very different from Horologium's estimate, incidentally.
* for those who wish to check, feel free to cross-reference the list of diffs http://www.gwern.net/In%20Defense%20Of%20Inclusionism#link-removals against my recent edits: http://en.wikipedia.org/w/index.php?title=Special:Contributions/Gwern&of...
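A quick aside on sampling noise: the interval below is my own arithmetic, not part of the writeup. A standard Wilson score interval on 3/100 suggests the true reversion rate is low even at the upper bound:

```python
# Wilson score 95% interval for 3 reverts out of 100 removals -
# a standard way to put error bars on a small proportion.
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    p = successes / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom, (centre + margin) / denom

lo, hi = wilson_interval(3, 100)
print(f"3/100 reverted -> 95% CI roughly {lo:.1%} to {hi:.1%}")
# prints: 3/100 reverted -> 95% CI roughly 1.0% to 8.5%
```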
On 5/30/12, Gwern Branwen gwern0@gmail.com wrote:
http://en.wikipedia.org/w/index.php?title=Special:Contributions/Gwern&of...
You can put a date limiter on that URL so it won't become outdated.
This one should work indefinitely (unless some of the edits get deleted):
http://en.wikipedia.org/w/index.php?title=Special:Contributions/Gwern&of...
Carcharoth
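For readers unfamiliar with the trick, since the URLs above are truncated by the archive: `offset` is the standard MediaWiki paging parameter for Special:Contributions, taking a YYYYMMDDHHMMSS timestamp. The exact value below is only illustrative:

```
http://en.wikipedia.org/w/index.php?title=Special:Contributions/Gwern&offset=20120601000000&limit=100
```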
PS. You didn't have to spam links to your 'experiment' in the revert edit summaries, you know. Some good-faith editors may get upset by that. The edit summary was:
"rv test of editors for this page; you failed. see http://www.gwern.net/In%20Defense%20Of%20Inclusionism#sins-of-omission-exper..."
This is something else that could have benefitted from outside input. Some of the attitude you have towards all this rolls off the page, with phrases such as "perhaps editors collectively know that putting a link into a section named ‘External Links’ is painting a cross-hair on its forehead". My view is that if such experiments are to be carried out, it would be better if they were designed and conducted by those able to restrain themselves from such snark.
Carcharoth
On Wed, May 30, 2012 at 2:39 PM, Carcharoth carcharothwp@googlemail.com wrote:
You can put a date limiter on that URL so it won't become outdated. This one should work indefinitely (unless some of the edits get deleted):
http://en.wikipedia.org/w/index.php?title=Special:Contributions/Gwern&of...
Neat. I didn't know we could do that.
On Wed, May 30, 2012 at 2:43 PM, Carcharoth carcharothwp@googlemail.com wrote:
PS. You didn't have to spam links to your 'experiment' in the revert edit summaries, you know. Some good-faith editors may get upset by that.
I disagree. The edit summary box is far too short to include any real explanation, so a link to the full explanation is best. The other alternative is to include no explanation in any form, and I regard that as unacceptable - people should know why an apparently useless edit and revert were done.
The edit summary was:
"rv test of editors for this page; you failed. see http://www.gwern.net/In%20Defense%20Of%20Inclusionism#sins-of-omission-exper..."
This is something else that could have benefitted from outside input. Some of the attitude you have towards all this rolls off the page, with phrases such as "perhaps editors collectively know that putting a link into a section named ‘External Links’ is painting a cross-hair on its forehead".
I should pretend I have no point of view and I am disinterested while somehow not being uninterested? Academics may have to adopt such an imposture, but I do not. As long as my 'snark' does not change the results - as it does not - I do not care.
My view is that if such experiments are to be carried out, it would be better if they were designed and conducted by those able to restrain themselves from such snark.
Better how?
On 30 May 2012 20:41, Gwern Branwen gwern0@gmail.com wrote:
My view is that if such experiments are to be carried out, it would be better if they were designed and conducted by those able to restrain themselves from such snark.
Better how?
I'll add this to my list of "If you have to ask, you may never know" topics.
Charles
There were a number of flaws in this experiment that IMHO reduce its value.
Firstly, rather than measure vandalism, it created vandalism - and vandalism that didn't look like typical vandalism. Aside from the ethical issue involved, this will have skewed the result. In particular, the edit summaries were very atypical for vandalism; if I'd seen that edit summary on my watchlist I would probably have just sighed and taken it as another example of deletionism in action. Of the more than 13,000 pages on my watchlist I doubt there are 13 where I would look at such an edit - and that's assuming it was one of the changes on my watchlist I was even aware of; the list is far too big to check fully every day. Most IP vandals don't use jargon in edit summaries, and I know I'm not the only editor who is more suspicious of IP edits with blank edit summaries.
You only ran the experiment for one month. I often revert vandalism older than that; I may be unusual in having some tools for finding vandalism that has got past the hugglers, but I'm not unusual in sometimes taking articles back to the "last clean version". I'm also aware that we have a large number of editors whom we don't see every month - probably the fastest-growing part of the community. For example, we have 543 admins who have edited in the last three months but have fewer than thirty edits in the last two months: http://en.wikipedia.org/wiki/Wikipedia:List_of_administrators/Semi-active Many of these will be among the editors who visit occasionally to keep an eye on some articles that they care about.
If someone wants to work out how good we are at clearing up vandalism, I would suggest a more accurate way would be to:
1. Take a random set of edits from 12 months ago.
2. Check all of them, including the deleted ones, and classify them as vandalism or not.
3. Measure how long they persisted in the article.
4. Revert any vandalism still extant.
Of course that won't pick up edits pre-empted by the edit filter, but if the sample was sufficiently large and random it would give us a measure of the effectiveness of our anti-vandalism work. (A rough sketch of such a sampling harness follows below.)
Regards
WSC
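A minimal sketch of what WSC's steps 1-3 could look like against the live MediaWiki API. The API calls are real, but the sampling here draws random pages rather than random edits (which over-weights rarely-edited pages), and step 2's classification would still be manual:

```python
# Sketch: pull candidate revisions from ~12 months ago for human
# classification and persistence-checking. Deleted revisions are not
# visible via this API, which WSC's step 2 would also need to cover.
import requests

API = "https://en.wikipedia.org/w/api.php"

def random_article_titles(n):
    """Step 1 (approximate): draw n random mainspace titles."""
    r = requests.get(API, params={
        "action": "query", "list": "random",
        "rnnamespace": 0, "rnlimit": n, "format": "json",
    })
    return [p["title"] for p in r.json()["query"]["random"]]

def revisions_since(title, timestamp, limit=5):
    """Steps 2-3 setup: list the first few revisions on or after
    `timestamp`, so a reviewer can classify each edit and see how
    long it persisted before being changed or reverted."""
    r = requests.get(API, params={
        "action": "query", "prop": "revisions", "titles": title,
        "rvstart": timestamp, "rvdir": "newer", "rvlimit": limit,
        "rvprop": "ids|timestamp|user|comment", "format": "json",
    })
    page = next(iter(r.json()["query"]["pages"].values()))
    return page.get("revisions", [])

for title in random_article_titles(10):
    for rev in revisions_since(title, "2011-06-01T00:00:00Z"):
        print(title, rev["revid"], rev["timestamp"], rev.get("comment", ""))
```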
On 30 May 2012 23:02, Charles Matthews charles.r.matthews@ntlworld.com wrote:
On 30 May 2012 20:41, Gwern Branwen gwern0@gmail.com wrote:
My view is that if such experiments are to be carried out, it would be better if they were designed and conducted by those able to restrain themselves from such snark.
Better how?
I'll add this to my list of "If you have to ask, you may never know" topics.
Charles
On 31 May 2012 16:59, WereSpielChequers werespielchequers@gmail.com wrote:
There were a number of flaws in this experiment that IMHO reduce its value.
Firstly, rather than measure vandalism, it created vandalism - and vandalism that didn't look like typical vandalism. Aside from the ethical issue involved, this will have skewed the result. In particular, the edit summaries were very atypical for vandalism; if I'd seen that edit summary on my watchlist I would probably have just sighed and taken it as another example of deletionism in action. Of the more than 13,000 pages on my watchlist I doubt there are 13 where I would look at such an edit - and that's assuming it was one of the changes on my watchlist I was even aware of; the list is far too big to check fully every day. Most IP vandals don't use jargon in edit summaries, and I know I'm not the only editor who is more suspicious of IP edits with blank edit summaries.
This, I think, is a major issue which makes the results useless:

* The edit summary implies policy knowledge; I'd only check an edit like that on my watchlist on occasion. Not every edit needs checking, so we use our common sense over what likely needs checking.

* I believe that edit summary probably met a number of heuristics used by the anti-vandal tools to filter out "good" edits, which means they were immediately removed from the "front line" of scrutiny.
Tom
On 31 May 2012 17:03, Thomas Morton morton.thomas@googlemail.com wrote:
This, I think, is a major issue which makes the results useless:

- The edit summary implies policy knowledge; I'd only check an edit like that on my watchlist on occasion. Not every edit needs checking, so we use our common sense over what likely needs checking.

- I believe that edit summary probably met a number of heuristics used by the anti-vandal tools to filter out "good" edits, which means they were immediately removed from the "front line" of scrutiny.
It does demonstrate a problem with our processes, though. There are three ways in which bad edits can get reverted:
1) They get spotted on recent changes (probably using automated or semi-automated tools these days). It isn't practical to check every edit, so you can get your edit skipped over fairly easily by just giving it a good edit summary. If it isn't reverted within a few minutes, it isn't going to get spotted by this first line of defence.
2) They get spotted on someone's watchlist. Watchlists don't move as fast as recent changes, so you get a few hours, maybe even a couple of days, in which to spot something, but again good edit summaries will cause you to ignore an edit to an article you aren't watching too closely. That means only articles that are watched by someone who cares enough about them to check every edit, and where that someone checks their watchlist within a few hours of the edit being made, will get protected by this second line.
3) The third line is someone going to the article for some other reason, spotting the vandalism, and fixing it. There is no time limit for this, and it isn't unusual for vandalism to an obscure article to be fixed months after it happened. This line isn't going to detect bad removals, though, since there is nothing there to spot.
That means bad removals with good edit summaries to articles that aren't closely watched will often never get reverted.
This could be improved by making it more practical to check every edit, perhaps using the flagged revisions feature (at the moment, we probably check suspicious-looking edits multiple times, so there is spare resource to check the others if we could just be more efficient about it).
Gwern Branwen wrote:
...Academics may have to adopt such an imposture, but I do not. As long as my 'snark' does not change the results - as it does not - I do not care.
Bully for you.
My view is that if such experiments are to be carried out, it would be better if they were designed and conducted by those able to restrain themselves from such snark.
Better how?
Because, as a wise acquaintance recently told me, very simply: Delivery matters.
There is a mainstream out there which, like it or not, adopts and prefers a certain demeanor different from yours. You may pride yourself on not caring, on imagining that only the literal truth of your argument matters, but to that mainstream, other aspects *do* matter. If they dislike your snarky attitude, they reflexively mistrust both you and your argument -- and, to some extent, the community you associate with.
On Wed, May 30, 2012 at 2:33 PM, Gwern Branwen gwern0@gmail.com wrote:
Result: Of the 100 removals, just 3 were reverted.
You removed 100 external links and only 3 of the removals were reverted. I don't find that very surprising. My experience with external links is that *on average* they are low quality, and many of them do fail WP:EL. Of course there are good external links, but they are a minority on the articles I follow. Examples include these removals:
http://en.wikipedia.org/w/index.php?title=Scala_%28programming_language%29&a... http://en.wikipedia.org/w/index.php?title=HUD_%28video_gaming%29&diff=pr...
I suppose I am not a "deletionist" because I only remove the ones that are most blatantly spam or free of content. But there are many more that I would not worry about if someone else removed them.
Separately, the median number of watchlisters for the 100 pages you edited is 5. And we have no way to get the names of the watchlisters to see whether they are active. So for many of the pages, it seems plausible nobody even noticed that the link was removed. That is a separate issue, unrelated to the links themselves.
- Carl
On Thu, May 31, 2012 at 8:31 AM, Carl (CBM) cbm.wikipedia@gmail.com wrote:
Of course there are good external links, but they are a minority on the articles I follow. Examples include these removals:
http://en.wikipedia.org/w/index.php?title=Scala_%28programming_language%29&a... http://en.wikipedia.org/w/index.php?title=HUD_%28video_gaming%29&diff=pr...
I actually find your examples amusing. The HUD link was one of the ones I was seriously considering not restoring because it was a junk link, while I was especially disappointed to see that the Scala editors did not restore the link for what is not just their standard IDE, but a major reason for use of their language, an exemplar of their close alliance/fusion with Java, and a vital resource to link especially given how impoverished the external links section was. (And I've never written a line of Scala in my life!)
Separately, the median number of watchlisters for the 100 pages you edited is 5.
Where is this figure coming from?
And we have no way to get the names of the watchlisters to see whether they are active. So for many of the pages, it seems plausible nobody even noticed that the link was removed. That is a separate issue, unrelated to the links themselves.
If the community "exists" but is inactive, that's as bad as it not existing. Wikipedia is as Wikipedia does. Either way, the test is revealing.
On 5/31/12, Gwern Branwen gwern0@gmail.com wrote:
On Thu, May 31, 2012 at 8:31 AM, Carl (CBM) cbm.wikipedia@gmail.com wrote:
Separately, the median number of watchlisters for the 100 pages you edited is 5.
Where is this figure coming from?
Possibly some variant of this:
http://toolserver.org/~mzmcbride/watcher/
That was limited so it doesn't show exact counts for pages with fewer than 30 watchers, though, after some objections were raised. Possibly admins or others can still see the counts for such pages. Can't remember.
Carcharoth
On Thu, May 31, 2012 at 9:57 AM, Gwern Branwen gwern0@gmail.com wrote:
Separately, the median number of watchlisters for the 100 pages you edited is 5.
Where is this figure coming from?
There is a redacted (no user info) table in the toolserver database that can be used to count the number of editors who watchlist a page. I fetched the counts for the 100 articles and found the median.
- Carl
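Carl's computation is easy to reconstruct. A minimal sketch, assuming the redacted view exposes the same columns as MediaWiki's standard `watchlist` table (one row per watching user, with `wl_namespace`/`wl_title` columns) and that `cursor` is a DB-API cursor on the replica; the view's actual name and schema are my guesses:

```python
# Median watcher count for a list of article titles, via a
# (hypothetical) redacted copy of the `watchlist` table.
import statistics

def watcher_count(cursor, title):
    # Titles in the watchlist table use underscores, not spaces;
    # namespace 0 is articles.
    cursor.execute(
        "SELECT COUNT(*) FROM watchlist"
        " WHERE wl_namespace = 0 AND wl_title = %s",
        (title.replace(" ", "_"),),
    )
    return cursor.fetchone()[0]

def median_watchers(cursor, titles):
    return statistics.median(watcher_count(cursor, t) for t in titles)
```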
On Thu, May 31, 2012 at 11:08 AM, Carl (CBM) cbm.wikipedia@gmail.com wrote:
There is a redacted (no user info) table in the toolserver database that can be used to count the number of editors who watchlist a page. I fetched the counts for the 100 articles and found the median.
Ah. That's interesting to know and useful for context, thank you.
On Thu, May 31, 2012 at 11:59 AM, WereSpielChequers werespielchequers@gmail.com wrote:
Firstly, rather than measure vandalism, it created vandalism - and vandalism that didn't look like typical vandalism. Aside from the ethical issue involved, this will have skewed the result.
As I've said multiple times, this was a deliberate feature, not a bug. The goal was not to measure the reversion rate of the broadest possible kind of vandalism, as that has been amply studied*, but of a specific kind. Complaining that this specific kind is 'skewed' compared to 'all possible vandalism related to external links' is to miss the point.
* Wikipedia generally does very well on *obvious* vandalism, especially since the introduction of anti-vandalism bots using machine-learning techniques. There's no need for anyone to spend time measuring it, except perhaps bot-writers fine-tuning their statistics.
In particular, the edit summaries were very atypical for vandalism; if I'd seen that edit summary on my watchlist I would probably have just sighed and taken it as another example of deletionism in action.
I propose a version of http://en.wikipedia.org/wiki/Poe%27s_law - it is impossible to create an example of deletionism mindless enough to be detectable as such if it comes with jargon attached.
Of the more than 13,000 pages on my watchlist I doubt there are 13 where I would look at such an edit - and that's assuming it was one of the changes on my watchlist I was even aware of; the list is far too big to check fully every day. Most IP vandals don't use jargon in edit summaries, and I know I'm not the only editor who is more suspicious of IP edits with blank edit summaries.
You only ran the experiment for one month. I often revert vandalism older than that; I may be unusual in having some tools for finding vandalism that has got past the hugglers, but I'm not unusual in sometimes taking articles back to the "last clean version".
You are unusual. When I was spending time reading academic publications on Wikipedia a few years ago, a number of them dealt with quantifying vandalism and reversions; almost all vandalism was reverted within days, and reversions which took longer than a month were very rare (0-10%, IIRC, being very generous). This is why I chose to wait a month: waiting longer would have added nothing. A week would have been adequate.
There are a number of related papers, but for brevity's sake take ftp://193.206.140.34/mirrors/epics-at-lnl/WikiDumps/localhost/group282-priedhorsky.pdf which found an exponential distribution for ordinary vandalism:
42% of damage incidents are repaired essentially immediately (i.e., within one estimated view). This result is roughly consistent with the work of Viégas et al. [20], which showed that the median persistence of certain types of damage was 2.8 minutes. However, 11% of incidents persist beyond 100 views, 0.75% – 15,756 incidents – beyond 1000 views, and 0.06% – 1,260 incidents – beyond 10,000 views.
On average, the articles concerned had fewer than 100 page views a day going off stats.grok.se, so within just a few days most of the edits should have been reverted - if they were going to be, of course. This sort of behavior is why you see such different averages and medians when you go looking in papers (a back-of-the-envelope check follows the reference list below); e.g.:
- ["Measuring Wikipedia"](http://eprints.rclis.org/bitstream/10760/6207/1/MeasuringWikipedia2005.pdf), Voss 2005 - ["Studying Cooperation and Conflict between Authors with history flow Visualizations"](http://alumni.media.mit.edu/~fviegas/papers/history_flow.pdf), Viégas et al 2003 - ["Detecting Wikipedia vandalism via spatio-temporal analysis of revision metadata?"](http://repository.upenn.edu/cgi/viewcontent.cgi?article=1963&context=cis...), West 2010 - ["User Contribution and Trust in Wikipedia"](http://www.ics.uci.edu/~sjavanma/CollabCom), Javanmardi et al - ["He says, she says: conflict and coordination in Wikipedia"](http://nguyendangbinh.org/Proceedings/CHI/2007/docs/p453.pdf), Kittur et al 2007
On Thu, May 31, 2012 at 12:03 PM, Thomas Morton morton.thomas@googlemail.com wrote:
This, I think, is a major issue which makes the results useless:

- The edit summary implies policy knowledge; I'd only check an edit like that on my watchlist on occasion.
And deletionists have no policy knowledge?
On 5/31/12, Gwern Branwen gwern0@gmail.com wrote:
On average, the articles concerned had fewer than 100 page views a day going off stats.grok.se, so within just a few days most of the edits should have been reverted - if they were going to be, of course.
This assumes that page views correspond to people reading the pages. I suspect that a lot of people viewing a page just scan briefly for what they are looking for (I typically use Ctrl+F to find something if I am in a hurry), or realise they are in the wrong place and click away or click onwards through another link. There is no way of measuring the number of people that stop and carefully read a page as if they were sitting down to do some bedtime or leisure reading, as opposed to just looking up some factoid.
And deletionists have no policy knowledge?
Deletionists are not the monolithic body of people that you seem to think they are. Those with these tendencies (though I'm reluctant to lump people under a label) vary widely in their knowledge of policy, which should be no surprise.
I'm also puzzled by this view you have that removal of external links is a form of deletionism. I've always understood deletionism to be the removal of entire articles and restricting Wikipedia to a relatively narrow set of articles. Removal of content within articles is a completely different ballgame.
Carcharoth
On 1 June 2012 11:19, Carcharoth carcharothwp@googlemail.com wrote:
And deletionists have no policy knowledge?
Deletionists are not the monolithic body of people that you seem to think they are. Those with these tendencies (though I'm reluctant to lump people under a label) vary widely in their knowledge of policy, which should be no surprise.
I'm also puzzled by this view you have that removal of external links is a form of deletionism. I've always understood deletionism to be the removal of entire articles and restricting Wikipedia to a relatively narrow set of articles. Removal of content within articles is a completely different ballgame.
Gah. WP really needs the tension between quality and quantity to be expressed by a two-party system like it needs a hole in the head. And it needs deletion debates whose length is greatest where the outcome matters least (i.e. the indifference point for inclusion) like it needs several more. Further, people who think "knowledge of policy" amounts to knowing the letter of the law are a menace, as are people who think detailed policies are there to help them win arguments, rather than for the general good of the project (it being easier to prove your point if you assume what you want to prove at the outset).
Charles
On Fri, Jun 1, 2012 at 6:19 AM, Carcharoth carcharothwp@googlemail.com wrote:
This assumes that page views correspond to people reading the pages. I suspect that a lot of people viewing a page just scan briefly for what they are looking for (I typically use Ctrl+F to find something if I am in a hurry), or realise they are in the wrong place and click away or click onwards through another link. There is no way of measuring the number of people that stop and carefully read a page as if they were sitting down to do some bedtime or leisure reading, as opposed to just looking up some factoid.
I'm sure the numbers are false, but numbers are always false. You make points which are equally true of any article's statistics on stats.grok.se (including the most popular ones), and this overestimation is counterbalanced by the many forms of *under*estimation going into the stats.grok.se numbers, like not counting page views on any mirrors at all. Unless you have a reason to think that the net error, inclusive of all these sources, leads to overestimation, pointing out the possible error is a bit sophomoric.
On Fri, Jun 1, 2012 at 10:15 AM, Gwern Branwen gwern0@gmail.com wrote:
On Fri, Jun 1, 2012 at 6:19 AM, Carcharoth carcharothwp@googlemail.com wrote:
This assumes that page views correspond to people reading the pages. I suspect that a lot of people viewing a page just scan briefly for what they are looking for (I typically use Ctrl+F to find something if I am in a hurry), or realise they are in the wrong place and click away or click onwards through another link. There is no way of measuring the number of people that stop and carefully read a page as if they were sitting down to do some bedtime or leisure reading, as opposed to just looking up some factoid.
I'm sure the numbers are false, but numbers are always false. You make points which are equally true of any article's statistics on stats.grok.se (including the most popular ones), and this overestimation is counterbalanced by the many forms of *under*estimation going into the stats.grok.se numbers, like not counting page views on any mirrors at all. Unless you have a reason to think that the net error, inclusive of all these sources, leads to overestimation, pointing out the possible error is a bit sophomoric.
In which a discussion is repeatedly reframed as a confrontation, for no apparent reason and to no apparent benefit.
On Wed, May 30, 2012 at 2:33 PM, Gwern Branwen gwern0@gmail.com wrote:
My experiment has concluded and all the link removals have been reverted*. The full writeup is at http://www.gwern.net/In%20Defense%20Of%20Inclusionism#sins-of-omission-exper...
Result: Of the 100 removals, just 3 were reverted.
3% is even lower than I expected, and very different from Horologium's estimate, incidentally.
Today I did a follow-up at the one-month point, hand-checking the 100 links I restored to articles while cleaning up the experiment. Of the 100, 4 do not appear in the current versions of their articles.
(2 of the removals were in direct response to the restoration, while the other 2 were either unexplained parts of large edits with many changes, or were removed in a wholesale culling of the External links section.)
Those who think that 3% was the correct reversion rate for the removals are invited to explain how 4% could be the correct reversion rate for the re-adding of the same links - if it was acceptable for 97% to be removed in the first place, how could it also be acceptable for 96% to then be restored?