[WikiEN-l] What proportion of articles are stubs?
Carl (CBM)
cbm.wikipedia at gmail.com
Mon Nov 29 21:16:38 UTC 2010
On Mon, Nov 29, 2010 at 12:33 PM, Charles Matthews
<charles.r.matthews at ntlworld.com> wrote:
> Stubs and how to handle them seem to be controversial still (or again),
> which is rather surprising given that we have been going nearly a decade
> now. I'd like to ask how many articles still are stubs, by some sensible
> standard?
The following data is from the live toolserver database just now.
This is not a very detailed standard for counting the number of stubs,
but at least it's objective.
There are 3,517,730 non-redirect pages in the main namespace. Of
these, 3,144,982 are less than 10,000 bytes; 2,596,291 are less than
5,000 bytes; 1,422,480 are less than 2,000 bytes; 547,342 are less
than 1,000 bytes; and 185,932 are less than 500 bytes. There are about
186,000 pages in [[Category:All disambiguation pages]], which are
included in the above numbers. Redirects are *not* included.
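
Since the original question asks for a proportion, the counts above can be turned into percentages of the 3,517,730 total. The following is a minimal sketch in Python; the names TOTAL_PAGES, PAGES_BELOW and share_below are illustrative and not part of any toolserver query, and the figures are simply those quoted in this message.

```python
# Counts quoted in the message: non-redirect pages in the main namespace.
TOTAL_PAGES = 3_517_730

# Size threshold in bytes -> number of pages smaller than that threshold.
# The buckets are cumulative, so each one contains the ones below it.
PAGES_BELOW = {
    10_000: 3_144_982,
    5_000: 2_596_291,
    2_000: 1_422_480,
    1_000: 547_342,
    500: 185_932,
}


def share_below(threshold: int) -> float:
    """Return the percentage of pages smaller than `threshold` bytes."""
    return 100.0 * PAGES_BELOW[threshold] / TOTAL_PAGES


if __name__ == "__main__":
    # Print the share for each threshold, largest first.
    for threshold in sorted(PAGES_BELOW, reverse=True):
        print(f"< {threshold:>6,} bytes: {share_below(threshold):5.1f}%")
```

On these figures, a little under three quarters of the pages are below 5,000 bytes and roughly one in six is below 1,000 bytes. The disambiguation pages are still counted here, because the message does not give their size distribution.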
If we estimate 4.5 bytes per word plus another byte for a space (5.5
bytes per word in total), a 1,000-byte article would have about 182
words (ignoring templates and categories), and a 5,000-byte article
would have about 909 words.
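
The word estimate is a single division, sketched below in Python. The constant BYTES_PER_WORD and the function estimated_words are hypothetical names chosen for illustration, and the result is only as good as the 4.5-bytes-per-word assumption; wiki markup, templates and multi-byte characters would all shift it.

```python
# Assumption from the message: an average word is 4.5 bytes,
# followed by one byte for the separating space.
BYTES_PER_WORD = 4.5 + 1.0


def estimated_words(size_bytes: int) -> float:
    """Estimate the number of words in a page of `size_bytes` bytes."""
    return size_bytes / BYTES_PER_WORD


if __name__ == "__main__":
    # Estimate word counts for the size thresholds used in the message.
    for size in (500, 1_000, 2_000, 5_000, 10_000):
        print(f"{size:>6,} bytes: about {round(estimated_words(size))} words")
```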
I think it's safe to say that the majority of our articles are "short"
and a significant minority are "very short".
- Carl