[WikiEN-l] What proportion of articles are stubs?

Carcharoth carcharothwp at googlemail.com
Mon Nov 29 21:22:19 UTC 2010


On Mon, Nov 29, 2010 at 9:16 PM, Carl (CBM) <cbm.wikipedia at gmail.com> wrote:

> I think it's safe to say that the majority of our articles are "short"
> and a significant minority are "very short".

Is it possible to have a breakdown of the high end of that, i.e. the
number of articles from 10,000 bytes upwards in steps of 5,000 bytes?
(I forget what the size of the largest article is.) Also, have you
looked at the byte size and word count of some actual articles, to see
how accurate your "4.5-bytes-per-word" estimate is?
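
To make the two requests concrete, here is a minimal sketch in Python.
It is an illustration only, not the method used for the figures quoted
above. The function names (size_histogram, bytes_per_word) and the
sample sizes are hypothetical, and "word" is assumed to mean a
whitespace-separated token.

```python
from collections import Counter


def size_histogram(sizes, start=10_000, step=5_000):
    """Count articles per bucket of `step` bytes, beginning at `start` bytes.

    Keys are the lower bound of each bucket; sizes below `start` are ignored.
    """
    counts = Counter()
    for size in sizes:
        if size >= start:
            lower = start + ((size - start) // step) * step
            counts[lower] += 1
    return dict(sorted(counts.items()))


def bytes_per_word(text):
    """UTF-8 byte length divided by the whitespace-separated word count."""
    words = text.split()
    if not words:
        return 0.0
    return len(text.encode("utf-8")) / len(words)


# Made-up article sizes in bytes, for illustration only.
sample_sizes = [800, 9_999, 10_000, 14_999, 15_000, 23_500, 61_200]
print(size_histogram(sample_sizes))
print(bytes_per_word("the quick brown fox"))
```

Running bytes_per_word over the text of a handful of real articles and
comparing the result with 4.5 would be one way to check the estimate.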

Carcharoth


