[Wikipedia-l] Project: This wikipedia-related article is a stub...

Tomasz Wegrzanowski taw at users.sf.net
Tue Sep 6 22:49:11 UTC 2005


On Tue, Sep 06, 2005 at 11:52:21PM +0200, Lars Aronsson wrote:
> Paweł Dembowski wrote:
> > It seems to me that Swedish Wikipedia is quite the opposite - they
> > have over 100,000 articles mostly because of the huge amount of
> > substubs...
> 
> I agree that this is embarrassing and should be addressed. I think 
> that the Danish Wikipedia, with 30,000 articles, has an even 
> higher percentage of (sub-)stubs than the Swedish one, but this is 
> just a feeling and I have no numbers to prove this.  We need a 
> statistic for the amount of (sub-)stubs, so we can talk verifiable 
> numbers (and set goals) instead of guesstimates.  How do we define 
> that?  Is the ">200 ch" count ("alternative" article count, [1]) 
> in Erik Zachte's Wikistats a good metric?  Or the percentage of 
> articles longer than 0.5 kilobytes [2]?  I think 200 characters is 
> an OK stub, but perhaps a substub is less than 70 characters?
> This leaves us with the Special:Shortpages page.  That page has 
> the advantage of being instantly updated, which Wikistats is not.
> 
> The Swedish Wikipedia has 421 articles (0.4% of 102K) shorter than 
> 70 bytes and the Danish has 351 (1.1% of 31K).  As a comparison, 
> the Dutch Wikipedia has 79 (0.08% of 89K) and the Polish has 387 
> (0.4% of 93K).  This makes the Polish look just as bad as the 
> Swedish, since both have 0.4% of articles shorter than 70 bytes.
> But perhaps a substub should be defined at 50 bytes instead?
> Or 100 bytes or 150?
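
For reference, the shares quoted above can be rechecked with a few lines
of Python. This is only a rough sketch: the totals are the rounded "K"
figures from the message, not exact article counts, and the names
short_pages and share_percent are made up for illustration.

```python
# Figures quoted above: (pages shorter than 70 bytes, total articles).
# Assumption: totals are the rounded "K" values, so results are approximate.
short_pages = {
    "Swedish": (421, 102_000),
    "Danish": (351, 31_000),
    "Dutch": (79, 89_000),
    "Polish": (387, 93_000),
}


def share_percent(short, total):
    """Return the share of short pages as a percentage of all articles."""
    return 100.0 * short / total


shares = {wiki: share_percent(s, t) for wiki, (s, t) in short_pages.items()}

# Print the wikis from highest to lowest share of very short pages.
for wiki, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{wiki:8s} {pct:.2f}%")
```

With these rounded totals the Swedish and Polish shares both come out
near 0.4% and the Danish near 1.1%, as quoted; the Dutch share lands
slightly below 0.09%, so the quoted 0.08% is probably truncated or based
on a more exact total.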

Numbers like 0.4% of articles tell us more about the effectiveness
of the wiki cleanup process than about the typical article.
(And by the way, Special:Shortpages is not updated live
on Wikimedia servers.)

Just take a look at the list of shortest pages on Polish
Wikipedia - they're almost all:
* Redirects (what are they doing on the list?)
* Disambiguation pages without descriptions for the links.
  Sometimes articles have titles so obvious that {{disambig}} +
  list of the links is enough.
* A few cases of things that look like leftovers from
  past technical problems
* A few cases of things that should be immediately deleted,
  but have been missed or are simply too recent and will
  be deleted soon
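
To get a substub count that is not dominated by the categories above,
one could filter out redirects and disambiguation pages before counting.
The following is a minimal sketch, assuming the pages are available as
(title, wikitext) pairs, for example from a database dump. The function
names, the sample pages and the two patterns are illustrative
assumptions: a real wiki may use localized redirect keywords and
differently named disambiguation templates.

```python
import re

# Assumption: redirects start with "#REDIRECT" and disambiguation pages
# carry a bare {{disambig}} template. Localized aliases are not handled.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT", re.IGNORECASE)
DISAMBIG_RE = re.compile(r"\{\{\s*disambig\s*\}\}", re.IGNORECASE)


def classify(wikitext):
    """Label a page as 'redirect', 'disambiguation' or 'article'."""
    if REDIRECT_RE.match(wikitext):
        return "redirect"
    if DISAMBIG_RE.search(wikitext):
        return "disambiguation"
    return "article"


def substub_counts(pages, thresholds=(50, 70, 100, 150)):
    """Count genuine articles shorter than each byte threshold."""
    counts = {t: 0 for t in thresholds}
    for _title, text in pages:
        if classify(text) != "article":
            continue  # skip redirects and disambiguation pages
        size = len(text.encode("utf-8"))  # size in bytes, not characters
        for t in thresholds:
            if size < t:
                counts[t] += 1
    return counts


# Hypothetical sample data, only to show the call.
sample_pages = [
    ("A", "#REDIRECT [[B]]"),
    ("C", "{{disambig}}\n* [[C (film)]]\n* [[C (book)]]"),
    ("D", "D is a village in Poland."),
    ("E", "E is a river. " * 10),
]
counts = substub_counts(sample_pages)
print(counts)
```

Counting at several thresholds at once (50, 70, 100 and 150 bytes, the
values floated earlier in the thread) makes it easy to see how sensitive
the result is to the chosen cutoff.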


