[Foundation-l] [Wikipedia-l] school articles : enough

Lars Aronsson lars at aronsson.se
Thu Jan 25 12:01:45 UTC 2007

David Monniaux wrote:

> On the English Wikipedia (but this is coming on other ones) we have a
> large amount of articles about individual highschools, most of which
> have nothing special and are just like the next highschool.
> These articles tend:
> * to lack perspective
> [...]
> However, when OTRS folks delete such articles as "non notable", they
> often face angry remarks, accusations of lack of democratic process,

I'm not interested in schools or whether they are worthy of 
articles, but I'm intrigued by the mathematical nature of this 
problem.

The people who wrote the articles lack perspective (on schools 
other than their own), and when an article is removed, they lack 
perspective on article removal as well. Aren't these necessary 
phenomena at the thin end of [[the long tail]]?

If we had complete visitor statistics from web logs (including 
Squid caches and reusers such as Answers.com), then we could point 
to numbers saying that this article has only been viewed so many 
times in the last year, and therefore it is not notable.  But even 
if this were practically achievable (which today it is not), would 
that be a useful solution?

All classic reasoning about notability is focused on the fat end 
of the tail.  Oscars are awarded to the best films, bookstores 
list the best selling books, the winners get the prizes.  But how 
can we achieve fairness, balance, equal coverage at the thin end?

In any written text (see [[en:Zipf's law]]), of all the words used 
(the vocabulary), about half of them will occur only once.  If 
the same mathematical distribution is applicable to topics in an 
encyclopedia, about half of all articles in Wikipedia are at the 
very thinnest end of the tail.  If we were to use visitor 
statistics to cut away the least notable topics, we could easily 
cut away half of our stock.  And that's hardly what we want.
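
The "about half occur only once" claim is easy to check on any text. 
What follows is only a minimal Python sketch, not a measurement of 
Wikipedia: the function name hapax_fraction, the crude tokenizer, and 
the toy sample sentence are all assumptions made for illustration. It 
reports what share of the distinct words in a text occur exactly once.

```python
import re
from collections import Counter


def hapax_fraction(text: str) -> float:
    """Return the share of distinct words that occur exactly once in text.

    Hypothetical helper for illustration; the tokenizer is deliberately
    crude (lowercased runs of ASCII letters and apostrophes).
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return 0.0  # empty text: no vocabulary to measure
    once = sum(1 for n in counts.values() if n == 1)
    return once / len(counts)


# Toy sample (an assumption, far too short to say anything about Zipf's law).
sample = "the cat sat on the mat and the dog sat on the rug"
print(hapax_fraction(sample))  # 5 of 8 distinct words occur once -> 0.625
```

Run on a long real text, the same function gives a quick check of 
whether the share is in fact near one half. Swapping word counts for 
per-article view counts would be the analogous test for an 
encyclopedia, if such statistics were available.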

So is there any other math we could do here?

  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se
