Hi everyone,
Many thanks for the responses so far. I’m going through the links that Tilman and Isaac
provided.
Here is some more background on what I’m trying to accomplish (I’m realizing that more
background usually helps). I have two projects going on: One is that later this month I’ll
be doing a short presentation at the Misinfocon conference, as part of a panel discussion
on quality at Wikipedia. The other project is that I’m writing a book for a general
audience about how the English Wikipedia works in its processes and culture. I’ll be happy
to talk more about this offline if anyone is interested.
Both of these projects are very general in scope so I’m trying to rely on existing
research as much as possible rather than conducting new primary research. I’d like to give
a sense of *approximately* how many good, bad, and controversial edits the English
Wikipedia gets. I’m not looking for perfect metrics, just ones that I can explain. E.g.
the percentage of edits that machines can classify as being reverted is one possible
metric of how many edits are considered to be bad by someone. I can explain that this
might undercount the actual figure because humans might partially revert or fully revert
an edit in a way that’s not machine-detectable.
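To make the "machine-detectable" caveat concrete, here is a minimal sketch of one common definition, usually called an identity revert: a revision counts as a revert if its text is byte-for-byte identical to an earlier revision of the same page. The function name, the toy revision texts, and the 15-revision look-back window are assumptions made for illustration, not something taken from a specific study.

```python
import hashlib


def find_identity_reverts(texts, radius=15):
    """Return (reverting_index, restored_index, reverted_indices) tuples.

    `texts` is a page's revision texts in chronological order.
    `radius` (an assumed value) limits how far back to look for a match.
    """
    checksums = [hashlib.sha1(t.encode("utf-8")).hexdigest() for t in texts]
    reverts = []
    for i, checksum in enumerate(checksums):
        # A revision identical to its direct predecessor is a null edit,
        # not a revert of anything.
        if i > 0 and checksums[i - 1] == checksum:
            continue
        # Search backwards for the most recent earlier revision with the
        # same checksum; everything in between was reverted.
        for j in range(i - 2, max(0, i - radius) - 1, -1):
            if checksums[j] == checksum:
                reverts.append((i, j, list(range(j + 1, i))))
                break
    return reverts


# A full revert is detected ...
print(find_identity_reverts(["A", "A plus vandalism", "A"]))
# ... but a partial revert, which restores only some of the old text, is not.
print(find_identity_reverts(["A", "A vandal spam", "A spam"]))
```

The second call illustrates the undercount: the bad edit was mostly undone, but because the resulting text matches no earlier revision, a checksum-based count misses it.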
I found the answer to my question #5 through a Quarry query (I love that site!). In 2019,
edit filters disallowed 581,120 attempted edits to the English Wikipedia, which is around
one disallow per minute and totals nearly 1% of all enwiki edits. If we assume all
disallowed edits are vandalism, and 2.5% of successful edits are vandalism, then around
3.5% of all attempted edits are vandalism and 29% of these attempts are disallowed by edit
filters.
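The arithmetic behind these figures can be sketched as follows. The only inputs are the 581,120 disallowed edits from the Quarry query, the 164,000 saved edits per day mentioned in the original question, and the 2.5% vandalism rate. Treating 164,000 as a year-round daily average for 2019 is an assumption, as is the function name. With unrounded inputs the last figure comes out closer to 28% than 29%, which is within the precision of an estimate like this.

```python
def edit_filter_estimates(disallowed_per_year=581_120,
                          saved_edits_per_day=164_000,
                          vandalism_share_of_saved=0.025):
    """Back-of-the-envelope estimates; all defaults are figures quoted in this thread."""
    minutes_per_year = 365 * 24 * 60
    saved_per_year = saved_edits_per_day * 365  # assumes a constant daily rate
    attempted_per_year = saved_per_year + disallowed_per_year

    # Assumption from the text: every disallowed edit is vandalism.
    vandalism_attempts = (disallowed_per_year
                          + vandalism_share_of_saved * saved_per_year)

    return {
        "disallows_per_minute": disallowed_per_year / minutes_per_year,
        "disallowed_vs_saved": disallowed_per_year / saved_per_year,
        "vandalism_share_of_attempts": vandalism_attempts / attempted_per_year,
        "share_of_vandalism_disallowed": disallowed_per_year / vandalism_attempts,
    }


for name, value in edit_filter_estimates().items():
    print(f"{name}: {value:.4f}")
```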
Cheers,
Su-Laine
On Feb 4, 2020, at 1:47 PM, Ziko van Dijk
<zvandijk(a)gmail.com> wrote:
Hello Su-Laine,
Interesting, I am very much looking forward to your results/paper.
Allow me a note on “reverts”. I am not sure exactly which methodology
you want to use, or what your approach / field is in general. It occurs to
me that a good definition of a revert is needed. Technically, a revert
means restoring a previous page version (I guess). But even in the
technical dimension, this is sometimes done with the “revert” function (or
the revert function that allows a comment), and sometimes “manually” by
creating a new version with the old content.
Sometimes the revert is a full revert, sometimes a partial revert.
Sometimes the old version is text A, the new version is text B, and the
“revert” is actually a version with text A' or B' or C (the apostrophe
in my notation means: similar to).
Also, what about reverting yourself? With what motive exactly?
If I am correct, you have mentioned some examples dealing with the reason
for deletion. That is an important approach too, of course. It would be
another step to consider the consequences of a revert in the social
dimension: how does a revert affect the social relationship between the
editors involved, and how is the general atmosphere on the wiki affected?
Just some thoughts, which may or may not be useful. :-)
Kind regards
Ziko
Tilman Bayer <haebwiki(a)gmail.com> wrote on Sat, Feb 1, 2020 at 03:25:
Concerning 1), and analyzing reverts in general, see
https://meta.wikimedia.org/wiki/Research:Revert .
To explore 5), https://meta.wikimedia.org/wiki/AbuseFilter and
https://tools.wmflabs.org/ptwikis/Filters:enwiki may be of interest.
Regards, HaeB
On Wed, Jan 29, 2020 at 12:01 PM Su-Laine Brodsky <sulainey(a)gmail.com>
wrote:
Hi everyone,
I’m looking for statistics about the edits that are reverted on the
English Wikipedia. This is for purposes of explaining to the public what
Wikipedia’s quality control processes are like. If hard numbers aren’t
available, I’m also interested in educated guesstimates.
1) An often-quoted statistic is that 7% of edits are reverted. Is this
still believed to be true?
2) According to
https://blog.wikimedia.org/2017/07/19/scoring-platform-team/, 2.5% of
edits are vandalism. There are other common reasons for reverting, and I’m
wondering if anyone has studied their frequency. Does anyone know what
percentage of all edits are reverted for being:
a) Spam (as perceived by the reverter)
b) Copyright violation
c) Violations of the Biographies of Living Persons policy
3) Do statistics on the number of edits per day on the English Wikipedia
(i.e. 164,000 edits per day) include edits that are blocked by the spam
blacklists or by edit filters?
4) How many edits per day on the English Wikipedia are prevented (blocked)
by the spam blacklists?
5) How many edits per day on the English Wikipedia are prevented by the
edit filters?
6) What percentage of all reverts are made by users of Huggle and Stiki?
7) What proportion of vandalism is quickly reverted? A 2007 study
(Priedhorsky et al.) found that 42% of vandalistic contributions are
repaired within one view and 70% within ten views. Have any newer studies
been done on this?
Thanks in advance!
Su-Laine
Vancouver, BC
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l