[Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

John Vandenberg jayvdb at gmail.com
Sat Jul 27 10:54:43 UTC 2013


On Sat, Jul 27, 2013 at 6:29 PM, Denny Vrandečić
<denny.vrandecic at wikimedia.de> wrote:
> Thank you for the observation.
>
> Is the graph <http://i.imgur.com/TfaD99V.png> based on actual data? Because
> it looks just a tad bit too linear to me. (I do not disagree with the
> finding, just wondering about the graph itself).
>
> I still would worry, though: our content is increasing linearly, as you
> say, but the number of active contributors is not. If we take for granted
> that active contributors are the ones who provide quality control for the
> articles, this means that since 2006 or so the ratio of content per
> contributor is linearly declining, which would mean that our quality would
> suffer.

There are a few parts of this that I don't think can be taken for
granted, and I would love to see stats about quality rather than
quantity: you're talking about quality, and that should be a
significant component of our analysis.

1) 'active contributors are the ones who provide quality control'

   Bots now do a lot of what used to be done by humans back in 2007,
rolling back most silly edits.
   And it is a small subset of active contributors who do the majority
of the maintenance.

2) the number of active contributors _doing quality control_ has declined.

   We know the overall number of editors is declining, and I think you
are right that the number doing quality control is declining too, but
is there evidence to support it?  And does that evidence show that the
decline is a problem?

My gut feeling is that the decline in 'quality control' edits is
tightly linked to the increase in bots doing quality control.

That is, do we have research showing that the total article-to-editor
ratio has a bearing on the average quality of content?
Could the average number of references per article serve as a proxy
for quality?
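
As a rough illustration of the references-per-article proxy, here is a
minimal sketch. It assumes the wikitext of each article is already
available as a plain string, and it simply counts <ref> tags, so a
reused named reference is counted each time it appears. The function
names and the sample texts are made up for illustration, not taken
from any existing tool or dataset.

```python
import re
from typing import Sequence

# Matches an opening <ref ...> or a self-closing <ref ... /> tag.
# It does not match </ref> or <references />.
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)


def count_refs(wikitext: str) -> int:
    """Count footnote markers (<ref> tags) in one article's wikitext."""
    return len(REF_TAG.findall(wikitext))


def average_refs_per_article(articles: Sequence[str]) -> float:
    """Mean number of <ref> tags per article; 0.0 for an empty input."""
    if not articles:
        return 0.0
    return sum(count_refs(text) for text in articles) / len(articles)


# Hypothetical sample data: one article with two footnotes, one with none.
sample = [
    'Foo is a thing.<ref>Smith 2010</ref> It is big.<ref name="a" />',
    "Bar has no sources.\n<references />",
]
print(average_refs_per_article(sample))
```

Tracking this average per month alongside the article-to-editor ratio
would only show whether the two move together, not whether one causes
the other.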

A link between that ratio and quality seems unlikely to me, as over
the last five years our content has increased in quality while our
number of editors has declined.

> I see two effects to counter that:
>
> 1) as you already mentioned, contributors are getting increasingly more
> experienced and more effective in fulfilling their tasks.
>
> 2) we continue to have a strong increase in readers and even stronger in
> pageviews (i.e. more and more people consult Wikipedia more and more). They
> probably also provide a layer of quality assurance, even though they might
> not qualify to be counted as active contributors.
>
> I have the gut feeling that 1) cannot be sufficient, and I would be curious
> about the effects of 2) - especially considering that much of the Foundation
> development work can be seen as improving 2 further (visual editor,
> article rating, mobile editing, etc.)

I agree with James that (1) is significant, and (2 - 'the future')
brings many unknowns with it.

(1) consists of our entire potential editor base, which includes all
our currently active editors, and all of our inactive editors who
are able to resume editing at any time - i.e. not blocked, not ^&%ed
off, etc.  They all know the syntax, and have demonstrated their
commitment to the vision, _and_ the writers have a personal connection
to the articles that they worked on.  I see lots of them come back
occasionally to touch up or expand their work.

(2) brings different editors, for good or ill.  There are some
concerns in the community that simplifying editing will bring more
non-trivial vandalism that bots can't handle, and even more
well-meaning editors who are discouraged when they can't understand
why their edit has disappeared, because they don't read the history,
the talk pages, etc.  The ratio of experienced editors to newbies
could be a significant factor in the maintenance of a friendly
environment.

More is not always better.

Don't get me wrong; a good VE will be very helpful, and the projects'
defensive mechanisms will adapt.  But I predict that if we see lots of
poor quality articles from VE, without adequate references, and the
community backlogs become problematic, the community will want to
develop tools to limit new poor quality articles.

Does anyone have stats for the number of blocked users per month over
the years (as blocking is hurting our potential editor base), and for
the number of reverts of edits by new users?

--
John Vandenberg


