[WikiEN-l] What proportion of articles are stubs?

WereSpielChequers werespielchequers at gmail.com
Sat Dec 4 11:47:19 UTC 2010


Two things that lead me to suspect our proportion of stubs may be
slowly falling:

1 The size of the database in gigabytes has been growing faster than
the number of articles.

2 Our total number of articles is still slowly increasing and will
probably soon exceed 3.5 million. But if we look at the stats at
http://stats.wikimedia.org/EN/TablesWikipediaEN.htm#editdistribution
the number of bytes of text is steadily rising and the percentage of
shorter articles is steadily falling. Take the 512 byte threshold: in
Jan 2007 18.8% of articles were shorter than this; by Jan this year
it was down to 11.3%.
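
To put that change in perspective, here is a rough Python sketch of the
arithmetic. It uses only the two percentages quoted above; the variable
names are my own, and "Jan this year" is taken to mean Jan 2010.

```python
# Share of articles shorter than 512 bytes, as quoted from the stats table.
share_2007 = 18.8  # percent, Jan 2007
share_2010 = 11.3  # percent, Jan 2010 (assumed meaning of "Jan this year")

# Absolute change in percentage points.
point_drop = share_2007 - share_2010

# Change relative to the 2007 share.
relative_drop = point_drop / share_2007 * 100

print(f"Drop: {point_drop:.1f} percentage points")    # Drop: 7.5 percentage points
print(f"Relative decline: {relative_drop:.1f}%")      # Relative decline: 39.9%
```

So the share of very short articles fell by roughly two fifths over three
years. That says nothing about absolute counts, since the total number of
articles grew over the same period.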

WereSpielChequers

On 29 November 2010 19:15, Andrew Gray <andrew.gray at dunelm.org.uk> wrote:
> On 29 November 2010 17:33, Charles Matthews
> <charles.r.matthews at ntlworld.com> wrote:
>> Stubs and how to handle them seem to be controversial still (or again),
>> which is rather surprising given that we have been going nearly a decade
>> now. I'd like to ask how many articles still are stubs, by some sensible
>> standard?
>
> Currently, 73% of enwp articles have some form of quality assessment.
> 13% have the "infrastructure" for assessment - talkpage templates -
> but no rating as yet; the remaining 14% are entirely unknown to the
> assessment system.
>
> Of the assessed articles, two thirds are rated as stubs.
>
> However, there's a massive great caveat to that: an awful lot of them
> aren't. Based on my experience, I'd say anything from a quarter to a
> half of the "stub" articles are not, by any reasonable definition,
> stubs. It's not uncommon now to see a multiple-paragraph article with
> an infobox, image and external links - lacking in many aspects of its
> coverage, no doubt, but a nontrivial amount of content - labelled as a
> stub.
>
> There are three factors at work here.
>
> a) Redefinition: As our standards grow higher, "stub" gets repurposed
> as a catch-all term for "very low-quality article"
> b) Lag: articles being marked as stubs, then expanding, but the tag
> not being removed (or removed from the talkpage and not from the
> rating template).
> c) Drift: people see the articles marked as stub in a) and b), and
> assume this is what one should be like, so grade accordingly.
>
> Overall, using the traditional definition of a stub as a "short
> placeholder article providing a basic degree of context", the sort of
> thing you might perhaps find in a concise reference work, I'd say
> ~50% of our articles qualify. I *think* the proportion of stubs
> created now is less than the proportion created in, say, 2006, but I
> don't have much evidence to back that up.
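
As a rough check on the assessment figures quoted above, here is a small
Python sketch. It covers only the 73% of articles that have a rating, so
it is not a reconstruction of the ~50% estimate; the variable names are
my own, and the 27% of unassessed articles are left out as unknown.

```python
# Figures quoted in the message above.
assessed_share = 0.73          # fraction of enwp articles with a quality rating
stub_share_of_assessed = 2 / 3 # fraction of assessed articles rated "stub"

# Fraction of ALL articles that carry a stub rating.
tagged_as_stub = assessed_share * stub_share_of_assessed

# "Anything from a quarter to a half" of those are said not to be real stubs.
mislabelled_low, mislabelled_high = 0.25, 0.50

# Fraction of all articles that are rated stub AND really are stubs.
true_stub_high = tagged_as_stub * (1 - mislabelled_low)
true_stub_low = tagged_as_stub * (1 - mislabelled_high)

print(f"Rated as stub: {tagged_as_stub:.1%} of all articles")
print(f"Genuine stubs among rated articles: "
      f"{true_stub_low:.1%} to {true_stub_high:.1%} of all articles")
```

On these numbers just under half of all articles carry a stub rating, and
somewhere between about a quarter and a bit over a third of all articles
are rated stubs that really are stubs. How many stubs sit among the
unassessed articles is not something these figures can tell us.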
>
> --
> - Andrew Gray
>   andrew.gray at dunelm.org.uk


