David Monniaux wrote:
On the English Wikipedia (but this is coming on other ones) we have a
large number of articles about individual high schools, most of which
have nothing special and are just like the next high school.
These articles tend:
* to lack perspective
However, when OTRS folks delete such articles as "non notable", they
often face angry remarks, accusations of lack of democratic process,
I'm not interested in schools or whether they are worthy of
articles, but I'm intrigued by the mathematical nature of this
The people who wrote the articles lack perspective (on other
schools than their own) and when the article is removed, they lack
perspective of having articles removed. Aren't these necessary
phenomena at the thin end of [[the long tail]]?
If we had complete visitor statistics from web logs (including
Squid caches and reusers such as Answers.com), then we could point
to numbers saying that this article has only been viewed so many
times in the last year, and therefore it is not notable. But even
if this were practically achievable (which today it is not), would
that be a useful solution?
All classic reasoning about notability is focused on the fat end
of the tail. Oscars are awarded to the best films, bookstores
list the best selling books, the winners get the prizes. But how
can we achieve fairness, balance, equal coverage at the thin end?
In any written text (see [[en:Zipf's law]]), of all the words used
(the vocabulary), about half of them will occur only once. If
the same mathematical distribution is applicable to topics in an
encyclopedia, about half of all articles in Wikipedia are at the
very thinnest end of the tail. If we were to use visitor
statistics to cut away the least notable topics, we could easily
cut away half of our stock. And that's hardly what we want.
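
To make the "about half occur only once" figure concrete, here is a
minimal Python sketch that measures the share of a vocabulary made up
of words seen exactly once. The function name hapax_fraction, the
simple tokenizer, and the sample sentence are my own illustrative
assumptions; a real measurement would need a full text, and the same
count could in principle be run on per-article view counts if we had
them.

```python
import re
from collections import Counter


def hapax_fraction(text: str) -> float:
    """Return the share of distinct words that occur exactly once."""
    # Crude tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return 0.0
    # Count vocabulary entries whose frequency is exactly one.
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)


# Toy sample, far too short to say anything about Zipf's law itself.
sample = "the cat sat on the mat and the dog sat on the rug"
print(hapax_fraction(sample))
```

A toy sentence proves nothing about the distribution, of course; the
point is only that the quantity is cheap to compute once the counts
exist.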
So is there any other math we could do here?
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se