[Wikimedia-l] article bytes more meaningful than users or revisions (was Re: Updates on VE data analysis)

Denny Vrandečić denny.vrandecic at wikimedia.de
Sat Jul 27 08:29:58 UTC 2013


Thank you for the observation.

Is the graph <http://i.imgur.com/TfaD99V.png> based on actual data? Because
it looks just a tad too linear to me. (I do not disagree with the
finding, I am just wondering about the graph itself.)
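
One rough way to put a number on "too linear" would be to fit a straight
line and look at how much variation is left over. The sketch below is only
an illustration: the two series are made up, not the data behind the graph,
and the helper names linear_fit and r_squared are my own. Real measurements
usually show some wobble, so an R^2 of essentially 1 would be a hint that a
curve was drawn from a formula rather than from data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up series for illustration only (NOT Wikipedia statistics).
months = list(range(12))
perfectly_linear = [3.0 * m + 5.0 for m in months]
with_wobble = [3.0 * m + 5.0 + (1.5 if m % 2 else -1.5) for m in months]

print("perfect line:", r_squared(months, perfectly_linear))
print("with wobble: ", r_squared(months, with_wobble))
```

The same check could be run on the monthly figures from the stats tables if
someone has them at hand.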

I would still worry, though: our content is increasing linearly, as you
say, but the number of active contributors is not. If we take for granted
that active contributors are the ones who provide quality control for the
articles, this means that since 2006 or so the number of contributors per
unit of content has been steadily declining, which would mean that our
quality would suffer.
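
To make the ratio argument concrete, here is a small sketch. All numbers are
hypothetical (arbitrary units, not real Wikipedia statistics), and
bytes_per_contributor is just a name I made up. It assumes content grows by
a fixed amount per year while the number of active contributors stays flat,
so the amount of content each contributor would have to watch keeps rising.

```python
def bytes_per_contributor(total_bytes, active_contributors):
    """Amount of article content per active contributor."""
    return total_bytes / active_contributors


# Hypothetical figures: linear content growth, flat contributor count.
years = list(range(2006, 2014))
content = [1_000 + 500 * i for i in range(len(years))]  # arbitrary units
contributors = [100] * len(years)                       # assumed constant

load = [bytes_per_contributor(c, n) for c, n in zip(content, contributors)]

for year, value in zip(years, load):
    print(year, value)
```

Under these assumptions the load per contributor grows every year, which is
the same thing as saying that contributors per unit of content shrink.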

I see two effects to counter that:

1) as you already mentioned, contributors are becoming more experienced
and more effective at their tasks.

2) we continue to have a strong increase in readers and an even stronger
one in pageviews (i.e. more and more people consult Wikipedia more and more
often). They probably also provide a layer of quality assurance, even though
they might not qualify to be counted as active contributors.

I have the gut feeling that 1) cannot be sufficient, and I would be curious
about the effects of 2), especially considering that much of the Foundation's
development work can be seen as improving 2) further (visual editor,
article rating, mobile editing, etc.).





2013/7/27 James Salsman <jsalsman at gmail.com>

> MZMcBride wrote:
> >... the number of non-deleted revisions per day for the
> > English Wikipedia. The results are here:
> > https://en.wikipedia.org/wiki/Special:Permalink/565971356
>
> So, that looks terrible: http://i.imgur.com/Z9lYCWj.png
>
> It looks terrible in the same way that every other graph of active
> users and several other related measures does.
>
> But it isn't. It doesn't account for the power law of practice which
> causes everyone who has ever edited Wikipedia to get better at it with
> time. And since so many IP editors are obviously returning, that means
> a lot more than under the false but very common assumption that every
> IP editor is new.
>
> Here's what really matters, articlespace size:
> http://i.imgur.com/TfaD99V.png
>
> The size of the article text in bytes has been marching on linearly
> since the beginning of Wikipedia, with extremely low variation, just
> like the short popular vital articles and every other measure of
> quality content.
>
> There is no legitimate basis to worry about anything until the total
> article bytes break out of their 12-year linear trend.
>
> (If you multiply columns 'E' and 'I' from
> http://stats.wikimedia.org/EN/TablesWikipediaEN.htm the database size
> shows a cusp at around 2006, corresponding to the growth modes, but
> two separate linear trends fit both modes far better than any growth
> model fits the entire curve.)




-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.

