It would be interesting to have some coarse characterisation of edits to see whether any
growth in edit count is spread uniformly across all contribution types or whether the growth
is disproportionate in some way. I suspect that the change in the length of the article is
a poor man’s approximation for the nature of the edit. Using the “generalisation
from a single example” method :) I took a look at my own recent contributions. As a rough
characterisation:
An increase of over 200 bytes seems to equate to adding content in the form of new
sentences, so likely to be new facts. And most edits in 100-200 extra bytes are content
related (or at least added citations).
Adding under 100 bytes seems to be more “housekeeping” of existing content. Nothing
factually new, but I might be adding a section header, some wikilinks, copyediting, adding
categories, etc.
Reductions of a small number of bytes (0-50) are most likely copyediting.
Reductions of more than 50 bytes are usually deletions of content (although they might be
part of a moving/merging process in which the content is actually preserved elsewhere, as I
use section editing a lot in the source editor). Not being a deletionist, my larger deletions
(where my intention is to remove the content entirely from WP) are usually pretty blatant
vandalism or nonsense. Generally if I sense good faith, I try to see if I can fix it up
rather than just chuck it out. As section blanking etc. is usually dealt with by ClueBot
and similar, I rarely need to restore large amounts of inexplicably deleted
content.
The above comments relate to articles rather than talk pages where different patterns
apply.
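For anyone who wants to try the rough characterisation above on their own contributions, here is a minimal Python sketch. The function name, the category labels and the handling of the exact boundary values (200, 100, 0, -50) are my own assumptions; the thresholds themselves come from a single editor's history and apply to article edits only, not talk pages.

```python
def classify_edit(byte_delta: int) -> str:
    """Map an article edit's change in length (in bytes) to a rough category.

    Thresholds follow the informal characterisation in this message.
    How the boundary values are assigned is an assumption.
    """
    if byte_delta > 200:
        return "new content"            # new sentences, likely new facts
    if byte_delta >= 100:
        return "content or citations"   # mostly content-related
    if byte_delta > 0:
        return "housekeeping"           # headers, wikilinks, categories
    if byte_delta >= -50:
        return "copyediting"            # small reductions (0-50 bytes)
    return "deletion or move"           # content removed or relocated


if __name__ == "__main__":
    for delta in (350, 150, 40, -20, -400):
        print(delta, classify_edit(delta))
```

A size-only classifier like this cannot tell a genuine deletion from a move or merge, so the last category is deliberately ambiguous.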
So I’d be curious to know if there’s any change to the proportion of (say) 200+ byte
additions to articles (not talk, etc) over time, as I think that’s a reasonable indicator
of new content rather than the maintenance of existing content.
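A sketch of how that proportion could be computed, assuming the article-namespace edits are already available as (month, byte delta) pairs. The input shape, the function name and reading "200+" as "200 or more" are all assumptions on my part; nothing here is tied to a particular dump or API.

```python
from collections import defaultdict


def share_of_large_additions(edits, threshold=200):
    """Return {month: fraction of edits adding at least `threshold` bytes}.

    `edits` is an iterable of (month, byte_delta) pairs, where month is a
    sortable label such as "2015-08" (a hypothetical input format).
    """
    totals = defaultdict(int)
    large = defaultdict(int)
    for month, delta in edits:
        totals[month] += 1
        if delta >= threshold:
            large[month] += 1
    return {month: large[month] / totals[month] for month in sorted(totals)}
```

Comparing the resulting series month by month would show whether growth in edit count is matched by growth in the larger additions.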
From: wiki-research-l-bounces(a)lists.wikimedia.org
[mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Jonathan Morgan
Sent: Tuesday, 25 August 2015 2:48 AM
To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
Subject: Re: [Wiki-research-l] Has the recent increase in English wikipedia's core
community gone beyond a statistical blip?
I don't think Jonathan was saying we should buy a full page ad in the NYT and declare
editor retention solved. I share his cautious optimism. The rate of the editor decline has
decreased along several metrics, and we're seeing an intriguing uptick in 100+ editor
activity.
Back in 2011, when he and I (and several others on this list) were participating in the
Summer of Research, the month-over-month metrics were decreasing at a rate that was kind
of alarming. Some combination of factors seems to have changed that pattern. Worth looking
into.
J
On Mon, Aug 24, 2015 at 9:35 AM, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
"Until we can prove it is good data we should treat it as good data"
is not how data works.
Absent exactly that analysis it is almost certainly a bad idea for us
to declare this to be good news; validate, /then/ celebrate.
On 24 August 2015 at 12:26, WereSpielChequers <werespielchequers(a)gmail.com> wrote:
100 edits a month does indeed have the disadvantage that not all edits are
equal; for some people that may represent 100 hours contributed, for others a
single hour. So an individual month could be inflated by something as trivial
as a vandal-fighting bot going down for a couple of days and a bunch of
old-timers responding to a call on IRC by coming back and running Huggle for
an hour.
But 7 months in a row where the total is higher than the same month the
previous year looks to me like a pattern.
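A minimal sketch of the year-over-year check described here, assuming the monthly counts of 100+ editors are held in a dict keyed by (year, month). The function name and the data layout are hypothetical, and no real figures are included.

```python
def yoy_streak(counts):
    """Length of the most recent run of months beating the same month a year earlier.

    `counts` maps (year, month) -> number of editors with 100+ edits that month.
    The run stops at the first month that is not higher than a year before,
    or that has no prior-year figure to compare against.
    """
    streak = 0
    for year, month in sorted(counts, reverse=True):
        previous = counts.get((year - 1, month))
        if previous is None or counts[(year, month)] <= previous:
            break
        streak += 1
    return streak
```

A run of seven or more would match the pattern described above, though the length of a streak on its own says nothing about what kind of editing is driving it.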
Across the 3,000 or so editors on English Wikipedia who contribute over a
hundred edits per month there could be a hidden pattern of an increase in
Huggle, STiki and AWB users more than offsetting a decline in manual
editing, but unless anyone analyses that and reruns those stats on some
metric such as "unique calendar hours in which someone saves an edit" I
think it best to treat this as an imperfect indicator of community health.
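One way the "unique calendar hours" metric could be computed, as a sketch only. It assumes edits are available as (user, timestamp) pairs with ISO 8601 timestamps without a timezone suffix; the function name and input format are assumptions, not an existing tool.

```python
from datetime import datetime


def active_hours(edits):
    """Count, per user, the distinct calendar hours in which an edit was saved.

    `edits` is an iterable of (user, timestamp) pairs, with timestamps such as
    "2015-08-24T09:35:00" (a hypothetical input format).
    """
    slots = {}
    for user, timestamp in edits:
        moment = datetime.fromisoformat(timestamp)
        # A calendar hour is identified by its date plus its hour of the day.
        slots.setdefault(user, set()).add((moment.date(), moment.hour))
    return {user: len(hours) for user, hours in slots.items()}
```

Under this measure an hour of Huggle with hundreds of saves counts the same as an hour containing one carefully researched edit, which is the point of the proposal.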
I'm not suggesting that we are out of the woods - there are other indicators
that are still looking bad, and I would love to see a better proxy for
active editors. But this is good news.
On 23 August 2015 at 19:31, Mark J. Nelson <mjn(a)anadrome.org> wrote:
WereSpielChequers <werespielchequers(a)gmail.com> writes:
Could you be more specific re "In general I'm not sure the 100+ count is
among the most reliable"? What in particular do you think is unreliable
about that metric?
The main thing I have questions about with that metric is whether it's a
good proxy for editing activity in general, or is dominated by
fluctuations in "bookkeeping" contributions, i.e. people doing
mass-moves of categories and that kind of thing (which makes it quite
easy to get to 100 edits). This has long been a complaint about edit
counts as a metric, which have never really been solidly validated.
Looking through my own personal editing history, it looks like there's
an anti-correlation between hitting the 100-edit threshold and making
more substantial edits. In months when I work on article-writing I
typically have only 20-30 edits, because each edit takes a lot of
library research, so I can't make more than one or two a day. In months
where I do more bookkeeping-type edits I can easily have 500 or 1000
edits.
But that's just for me; it's certainly possible that Wikipedia-wide,
there's a good correlation between raw edit count and other kinds of
desirable activity measures. But is there evidence of that?
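One way to look for that evidence would be to correlate raw edit counts with some other activity measure across editor-months. The sketch below computes a plain Pearson correlation in Python; the choice of Pearson, the function name and the idea of pairing edit counts with a second measure such as bytes added are assumptions for illustration, not an established method from this thread.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers.

    For example, xs could be monthly edit counts and ys a second activity
    measure for the same editor-months (hypothetical inputs).
    Raises ZeroDivisionError if either sequence has no variation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)
```

A value near -1 across many editors would echo the personal anti-correlation described above, while a value near +1 would support edit count as a proxy.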
--
Mark J. Nelson
Anadrome Research
http://www.kmjn.org
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Oliver Keyes
Count Logula
Wikimedia Foundation
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>