[Wikipedia-l] Project: This wikipedia-related article is a stub...
Tomasz Wegrzanowski
taw at users.sf.net
Tue Sep 6 22:49:11 UTC 2005
On Tue, Sep 06, 2005 at 11:52:21PM +0200, Lars Aronsson wrote:
> Paweł Dembowski wrote:
> > It seems to me that Swedish Wikipedia is quite the opposite - they
> > have over 100,000 articles mostly because of the huge amount of
> > substubs...
>
> I agree that this is embarrassing and should be addressed. I think
> that the Danish Wikipedia, with 30,000 articles, has an even
> higher percentage of (sub-)stubs than the Swedish one, but this is
> just a feeling and I have no numbers to prove this. We need a
> statistic for the amount of (sub-)stubs, so we can talk verifiable
> numbers (and set goals) instead of guesstimates. How do we define
> that? Is the ">200 ch" count ("alternative" article count, [1])
> in Erik Zachte's Wikistats a good metric? Or the percentage of
> articles longer than 0.5 kilobytes [2]? I think 200 characters is
> an OK stub, but perhaps a substub is less than 70 characters?
> This leaves us with the Special:Shortpages page. That page has
> the advantage of being instantly updated, which Wikistats is not.
>
> The Swedish Wikipedia has 421 articles (0.4% of 102K) shorter than
> 70 bytes and the Danish has 351 (1.1% of 31K). As a comparison,
> the Dutch Wikipedia has 79 (0.08% of 89K) and the Polish has 387
> (0.4% of 93K). This makes the Polish look just as bad as the
> Swedish, since both have 0.4% of articles shorter than 70 bytes.
> But perhaps a substub should be defined at 50 bytes instead?
> Or 100 bytes or 150?
Numbers like 0.4% of articles tell us more about the effectiveness
of the wiki cleanup process than about the typical article.
(By the way, Special:Shortpages is not updated live
on Wikimedia servers.)
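For reference, the percentages quoted above can be reproduced from
the raw counts. This is only a rough sketch: the totals are the
rounded figures from the quoted message (102K, 31K, 89K, 93K), and
the names short_pages and short_share are made up for illustration.

```python
# Pages shorter than 70 bytes and approximate article totals, as quoted
# in the message above. Totals are rounded to the nearest thousand, so
# the resulting percentages are approximate as well.
short_pages = {
    "Swedish": (421, 102_000),
    "Danish": (351, 31_000),
    "Dutch": (79, 89_000),
    "Polish": (387, 93_000),
}


def short_share(short, total):
    """Return the percentage of articles that fall below the cutoff."""
    return 100.0 * short / total


for wiki, (short, total) in short_pages.items():
    print(f"{wiki}: {short_share(short, total):.2f}% shorter than 70 bytes")
```

Whatever cutoff is chosen (50, 70, 100 or 150 bytes), the same
calculation applies; only the counts taken from Special:Shortpages
change.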
Just take a look at the list of shortest pages on Polish
Wikipedia - they're almost all:
* Redirects (what are they doing on the list?)
* Disambiguation pages without descriptions for the links.
Sometimes articles have titles so obvious that {{disambig}} +
list of the links is enough.
* A few cases of things that look like leftovers from
past technical problems
* A few cases of things that should be deleted immediately,
but have been missed or are simply too recent and will
be deleted soon
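
A substub count that skips the first two categories above could be
sketched roughly like this. It assumes the raw wikitext of each page
is already available as a string, that redirects start with
"#REDIRECT", and that disambiguation pages carry a literal
{{disambig}} tag; localized redirect keywords and other template
names are not handled. The names classify, count_substubs and
SUBSTUB_LIMIT are hypothetical, not part of MediaWiki.

```python
import re

SUBSTUB_LIMIT = 70  # bytes; the cutoff discussed in this thread


def classify(wikitext):
    """Roughly sort a page into redirect / disambiguation / substub / article."""
    text = wikitext.strip()
    # Assumption: redirects begin with the "#REDIRECT" keyword.
    if re.match(r"#redirect\b", text, re.IGNORECASE):
        return "redirect"
    # Assumption: disambiguation pages contain a literal {{disambig}} tag.
    if "{{disambig}}" in text.lower():
        return "disambiguation"
    # Measure length in bytes, since the thread talks about byte counts.
    if len(text.encode("utf-8")) < SUBSTUB_LIMIT:
        return "substub"
    return "article"


def count_substubs(pages):
    """Count pages that are short but neither redirects nor disambiguations."""
    return sum(1 for text in pages.values() if classify(text) == "substub")
```

With a filter like this, the remaining short pages would mostly be
the last two categories: technical leftovers and pages waiting to be
deleted.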