Hi everyone,
Many thanks for the responses so far. I’m going through the links that Tilman and Isaac provided.
Here is some more background on what I’m trying to accomplish (I’m realizing that more background usually helps). I have two projects going on: One is that later this month I’ll be doing a short presentation at the Misinfocon conference, as part of a panel discussion on quality at Wikipedia. The other project is that I’m writing a book for a general audience about how the English Wikipedia works in its processes and culture. I’ll be happy to talk more about this offline if anyone is interested.
Both of these projects are very general in scope, so I’m trying to rely on existing research as much as possible rather than conducting new primary research. I’d like to give a sense of *approximately* how many good, bad, and controversial edits the English Wikipedia gets. I’m not looking for perfect metrics, just ones that I can explain. For example, the percentage of edits that machines can classify as reverted is one possible measure of how many edits someone considers bad. I can explain that this might undercount the actual figure, because humans might partially or fully revert an edit in a way that’s not machine-detectable.
I found the answer to my question #5 through a Quarry query (I love that site!). In 2019, edit filters disallowed 581,120 attempted edits to the English Wikipedia, which is around one disallow per minute and totals nearly 1% of all enwiki edits. If we assume all disallowed edits are vandalism, and 2.5% of successful edits are vandalism, then around 3.5% of all attempted edits are vandalism and 29% of these attempts are disallowed by edit filters.
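The arithmetic above can be re-derived with a short sketch. It assumes the 164,000 edits-per-day figure from the original question held for all of 2019 and that this figure excludes disallowed attempts, neither of which has been confirmed in this thread. The function and variable names are only illustrative.

```python
# Back-of-envelope check of the edit-filter figures quoted above.
DISALLOWED_2019 = 581_120   # disallowed attempts in 2019, from the Quarry query
EDITS_PER_DAY = 164_000     # assumption: figure from the original question, excluding disallowed attempts
VANDALISM_RATE = 0.025      # assumption: share of successful edits that are vandalism


def filter_estimates(disallowed, edits_per_day, vandalism_rate, days=365):
    """Derive rough rates from a yearly count of disallowed edit attempts."""
    successful = edits_per_day * days
    attempted = successful + disallowed
    # Assumption from the text: every disallowed attempt is vandalism.
    vandal_attempts = successful * vandalism_rate + disallowed
    return {
        "disallows_per_minute": disallowed / (days * 24 * 60),
        "disallowed_vs_successful": disallowed / successful,
        "vandalism_share_of_attempts": vandal_attempts / attempted,
        "filter_share_of_vandalism": disallowed / vandal_attempts,
    }


estimates = filter_estimates(DISALLOWED_2019, EDITS_PER_DAY, VANDALISM_RATE)
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

With these unrounded inputs, the last ratio comes out slightly below the 29% obtained from the rounded 1% and 3.5% figures, which is well within the precision of an approximate estimate.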
Cheers, Su-Laine
On Feb 4, 2020, at 1:47 PM, Ziko van Dijk zvandijk@gmail.com wrote:
Hello Su-Laine,
Interesting, I am very much looking forward to your results/paper.
Allow me a note on “reverts”. I am not sure exactly which methodology you want to use, or what your approach and field are in general. It occurs to me that a good definition of a revert is needed. Technically, a revert means restoring a previous page version (I guess). But even in the technical dimension, this is sometimes done with the “revert” function (or the revert function that allows a comment), and sometimes “manually” by creating a new version with the old content.
Sometimes the revert is a full revert, sometimes a partial one. Sometimes the old version is text A, the new version is text B, and the “revert” is actually a version with text A' or B' or C (the apostrophe in my notation means “similar to”).
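To make the technical distinction concrete, here is a minimal sketch of one way a machine could detect full reverts: a revision counts as a revert when its text is byte-for-byte identical to an earlier version. This is only an illustration, not necessarily the method any particular study uses. The function name and the 15-revision look-back window are assumptions chosen for the example.

```python
import hashlib


def find_identity_reverts(revision_texts, window=15):
    """Return (reverting_index, restored_index) pairs for full reverts.

    A revision is flagged when its text hashes to the same value as an
    earlier revision within `window` revisions, with at least one different
    revision in between. The window size is an illustrative assumption.
    """
    digests = [hashlib.sha1(t.encode("utf-8")).hexdigest() for t in revision_texts]
    reverts = []
    for i, digest in enumerate(digests):
        # An edit identical to its direct predecessor changes nothing; skip it.
        if i > 0 and digests[i - 1] == digest:
            continue
        start = max(0, i - window)
        # Look backwards for the most recent earlier version with the same text.
        for j in range(i - 2, start - 1, -1):
            if digests[j] == digest:
                reverts.append((i, j))
                break
    return reverts


full = find_identity_reverts(["text A", "text B", "text A"])
partial = find_identity_reverts(["text A", "text B", "text A, reworded"])
print(full, partial)
```

In the second history the final version is only similar to text A, so nothing is flagged. That is exactly the kind of partial or manual revert that a checksum comparison misses.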
Also, what about reverting yourself? With what motive exactly?
If I am correct, you have mentioned some examples dealing with the reason for deletion. That is an important approach too, of course. A further step would be to consider the consequences of a revert in the social dimension: how does a revert affect the social relationship between the editors involved, and how is the general atmosphere on the wiki affected?
Here are some thoughts, maybe useful or not. :-)
Kind regards Ziko
On Sat, Feb 1, 2020 at 3:25 AM, Tilman Bayer haebwiki@gmail.com wrote:
Concerning 1) and about analyzing reverts in general, see https://meta.wikimedia.org/wiki/Research:Revert .
To explore 5), https://meta.wikimedia.org/wiki/AbuseFilter and https://tools.wmflabs.org/ptwikis/Filters:enwiki may be of interest.
Regards, HaeB
On Wed, Jan 29, 2020 at 12:01 PM Su-Laine Brodsky sulainey@gmail.com wrote:
Hi everyone,
I’m looking for statistics about the edits that are reverted on the English Wikipedia. This is for purposes of explaining to the public what Wikipedia’s quality control processes are like. If hard numbers aren’t available, I’m also interested in educated guesstimates.
- An often-quoted statistic is that 7% of edits are reverted. Is this still believed to be true?
- According to https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 2.5% of edits are vandalism. There are other common reasons for reverting, and I’m wondering if anyone has studied their frequency. Does anyone know what percentage of all edits are reverted for being: a) spam (as perceived by the reverter), b) copyright violations, c) violations of the Biographies of Living Persons policy?
- Do statistics on the number of edits per day on the English Wikipedia (i.e. 164,000 edits per day) include edits that are blocked by the spam blacklists or by edit filters?
- How many edits per day on the English Wikipedia are prevented (blocked) by the spam blacklists?
- How many edits per day on the English Wikipedia are prevented by the edit filters?
- What percentage of all reverts are made by users of Huggle and Stiki?
- What proportion of vandalism is quickly reverted? A 2007 study (Priedhorsky et al.) found that 42% of vandalistic contributions are repaired within one view and 70% within ten views. Have any newer studies been done on this?
Thanks in advance!
Su-Laine Vancouver, BC
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l