[WikiEN-l] Looking for thoughts on statistics

Carcharoth carcharothwp at googlemail.com
Mon Mar 29 01:42:57 UTC 2010


On Sun, Mar 28, 2010 at 11:16 PM, Ian Woollard <ian.woollard at gmail.com> wrote:

<snip>

> The idea comes from a mixture of looking at the statistics peak and
> looking at the articles that still are needed. Nearly all of the
> low-hanging fruit is clearly gone now. Most of the mid-hanging fruit
> is also now gone. We're getting towards the top of the tree, things
> are getting more obscure. This is a *good* thing, not having so many
> holes in the Wikipedia!

But are the holes small ones to be filled in with only a few articles,
or large ones that will take a large number of articles to fill in? Or
to put that another way, is the number of "obscure" articles much
larger than the number of "obvious" articles? Or to use your analogy,
is there more fruit at the top of the tree than at the bottom?

Carcharoth wrote:
>> My view is that the rate of article creation and the number of
>> "missing" articles depends *heavily* on the topic area. Some topic
>> areas are very well covered, others are not so well covered. In the
>> former areas, you will indeed struggle to find new articles to create,
>> but there are some areas (history in particular) where there are
>> thousands (probably tens of thousands) of articles still needed.

Ian Woollard wrote:
> I'm sure you're correct. So if there's twenty or thirty other similar
> areas, then we're looking at under a million articles left to write.
> We're currently at 3.2 million. I think we'll exceed 4 million within
> a few years.

My view is that the growth can continue almost indefinitely, but the
rate slows as the obvious articles get written. That doesn't mean that
there is a natural limit, just that future growth will take a long
time (maybe forever) given that the articles to be written require
ever-increasing amounts of specialist knowledge (unless you redirect
efforts towards improving existing articles and strictly enforce which
articles take priority - e.g. getting the core articles to a good
state before writing more articles on obscure topics).

<snip>

>> A better approach would be to look at samples of article creation and
>> see what articles are being created and that will give you an idea of
>> where the gaps are being filled in and hence how big the gaps are.
>
> This IS the point though; we're now looking for the gaps. That's
> exactly what I'm saying. The Wikipedia should more or less run out of
> gaps in about 3 years (ish- but it's never going to completely run
> out, but growth from existing knowledge will be progressively slower
> and slower). OTOH the circle of knowledge is still growing, at a
> somewhat slower rate.

I meant looking at the actual articles created and seeing whether
they are really as obscure as you think, rather than generalising. :-)
If your hypothesis is correct, the articles being created should
increasingly be obscure ones being created by experienced Wikipedians.
Have you looked to see if that is what is actually happening?
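
As a rough sketch of the kind of check I mean, something like the
following would do for the "experienced Wikipedians" half of the
question. Everything in it is an assumption for illustration: the
sample values are invented placeholders (not real Wikipedia data),
the 1000-edit cutoff for "experienced" is arbitrary, and the function
name is hypothetical. Judging how obscure each article is would need
a separate measure that is not attempted here.

```python
from collections import defaultdict

# Hypothetical sample of newly created articles:
# (year created, creator's edit count at the time of creation).
# These values are invented placeholders, not real Wikipedia data.
SAMPLE = [
    (2007, 12), (2007, 3), (2007, 4500), (2007, 40),
    (2009, 2300), (2009, 15), (2009, 8000), (2009, 1200),
]

# Assumed cutoff for calling a creator "experienced".
EXPERIENCED_THRESHOLD = 1000


def experienced_share(sample, threshold=EXPERIENCED_THRESHOLD):
    """Return {year: fraction of sampled new articles by experienced creators}."""
    totals = defaultdict(int)
    experienced = defaultdict(int)
    for year, prior_edits in sample:
        totals[year] += 1
        if prior_edits >= threshold:
            experienced[year] += 1
    return {year: experienced[year] / totals[year] for year in sorted(totals)}


if __name__ == "__main__":
    for year, share in experienced_share(SAMPLE).items():
        print(year, f"{share:.2f}")
```

If the share rises from one year's sample to the next, that is at
least consistent with the hypothesis; if it stays flat, the new
articles are probably not as specialist as assumed.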

Carcharoth


