On 25/01/07, Lars Aronsson <lars(a)aronsson.se> wrote:
> In any written text (see [[en:Zipf's law]]), of all the words used
> (the vocabulary), about half of them will occur only once. If
> the same mathematical distribution is applicable to topics in an
> encyclopedia, about half of all articles in Wikipedia are at the
> very thinnest end of the tail. If we were to use visitor
> statistics to cut away the least notable topics, we could easily
> cut away half of our stock. And that's hardly what we want.
> So is there any other math we could do here?
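
As a rough way to check the "about half occur only once" figure against a
real text, the share of once-only words can be counted directly. This is
only a sketch: the function name hapax_fraction is made up for illustration,
and splitting on whitespace is a crude stand-in for proper tokenising.

```python
from collections import Counter


def hapax_fraction(tokens):
    """Return the share of distinct tokens that occur exactly once."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    # Count vocabulary entries whose frequency is exactly one.
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)


if __name__ == "__main__":
    sample = "the cat sat on the mat".split()
    print(hapax_fraction(sample))
```

The same count could in principle be run over per-article page views
instead of words, though whether article traffic follows the same
distribution is exactly the open question.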
The metric I would love to see is some way of identifying when
[amount of value our readers gain from this article] << [amount of
hassle caused to our volunteers by having this article]
where "hassle" is deletions, cleanup, vandalism repair, mentoring
edit wars, and the like, whilst "value" is... well, value. People
gaining useful information from it.
(Teenagers playing with the article to call their headmaster a child
molester is not "value", even though it may seem a perfectly
sensible use to them, nor is using the article to promote a
business... "value" is pretty much a function of quality times
readers)
Unfortunately, it's almost entirely impossible to calculate except by
gut feeling, and entirely impractical to implement. Ah, well.
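
Purely to show the shape of the comparison, here is a toy sketch. Every
input is hypothetical: nobody has a real number for "quality" or "hassle",
and the factor of 10 standing in for "<<" is an arbitrary assumption.

```python
def value(quality, readers):
    """Value as described above: roughly quality times readers."""
    return quality * readers


def hassle_dominates(quality, readers, hassle, factor=10.0):
    """Return True when value << hassle.

    "Much less than" is modelled here as value * factor < hassle;
    the factor is an assumed threshold, not anything measured.
    """
    return value(quality, readers) * factor < hassle


if __name__ == "__main__":
    # Made-up numbers: a poor, rarely read article that eats volunteer time.
    print(hassle_dominates(quality=0.2, readers=50, hassle=500))
    # Made-up numbers: a good, widely read article with the same hassle.
    print(hassle_dominates(quality=0.9, readers=10000, hassle=500))
```

The arithmetic is trivial; the hard part remains putting honest numbers
into it.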
--
- Andrew Gray
andrew.gray(a)dunelm.org.uk