On Sunday 09 March 2003 12:01 pm, Brion Vibber wrote:
Aha, again demonstrating the obsession over the count. Why was it important to hit or not hit 100,000? Because of an offhand remark made a couple years ago about "we hope to reach 100,000 articles"?
When did this become our holy mission?
Round numbers, especially large ones, are milestones that get people's attention. That is why x.0 is so important in the software world, why cities celebrate the day they reach 1,000,000 inhabitants, why there was so much mania when our calendars hit the year 2000, why the first billion-dollar business and the first billionaires are mentioned in history books, and why we got a lot of media attention after en.wiki hit the 100,000 count.
The article count is also a measure (however crude) of our progress. So there is nothing wrong with trying to improve that measure and make it more conservative where it makes sense. (Jimbo has already stated that he wanted a more conservative count. However, right after he said that, we had already hit the 100,000 mark and were being slashdotted.)
Did the messianic age begin when the counter flipped into six digits? Have we all been betrayed by a sinister being who wants to make us look bad by leading us astray and "inflating our count"?
What the *heck* does it matter?
Boy, are you in a really bad mood today. See above.
Bad to whom? Embarrassing to whom? Is it solely the use of the word "article" that throws us off? Are we obsessed with proving that our "articles" are so fricking wonderful that every single one of them must be the greatest pinnacle of writing prowess or we must lock it in the basement of shame and never admit its existence?
No - a simple automatic measure is all that is needed. We mention the definition of the count on en.wiki's [[Wikipedia:What is an article]] page.
Go open up a paper encyclopedia sometime. Look at it. A fair chunk of the articles are *one paragraph long*. Do their editors worry themselves over the metric they use to stamp "over 60,000 articles!" on the cover? Or do they just count the number of entries at some point and say "at least this many"?
Exactly - and how many bytes would a smallish complete paragraph be in such an encyclopedia? Around 500 bytes. With a threshold like that we could say that we have *at least* x number of articles. Right now the count includes many entries that do not consist of even one complete paragraph. A {{HEADLINEARTICLECOUNT}} set per language would be flexible enough for both large and small wikis. {{NUMBEROFARTICLES}} would be used for comparison purposes.
Mav, thanks for proving my point again about count-mania. Are you seriously suggesting that the pseudo-random number spit out on the front page actually *defines* what articles are in a meaningful way?
Again, more unnecessary anger. Please calm down - we are not talking about anything of such cosmic importance as to warrant such feelings. :-)
The answer to your question is above (the part talking about tracking our progress and how the outside world sees our progress). So yes, it is important to have a conservative estimate of the number of articles we have. That's not to say that everything a computer would recognize as an article is actually what a human would consider to be one. But since the computer will also miss entries that /could/ be considered articles, everything averages out in the end (some really obscure subjects can, in fact, be covered in a sub-500 byte entry).
In short, I'm not asking for an AI article count - I just would like to see a more conservative crude method used on en.wiki that excludes more entries that are probably not articles (however, we shouldn't go live with such a count until after we have enough entries to still be above 100,000 - otherwise we could get some negative media attention and a drop in morale).
IMO the best way to do that is to have a {{HEADLINEARTICLECOUNT}} set per wiki in addition to {{NUMBEROFARTICLES}}. It would be up to each language to define its own byte threshold for its own headline count (or it could choose to ignore {{HEADLINEARTICLECOUNT}} and use the much less conservative {{NUMBEROFARTICLES}}). Of course, each wiki that uses {{HEADLINEARTICLECOUNT}} would then have to publicly document the threshold for its own headline count.
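To make the idea concrete, here is a rough sketch in Python of how the two counts could relate. It is only an illustration, not how the wiki software actually computes anything: the function names, the settings table, and the representation of entries as plain strings are all assumptions of mine, and only the 500-byte figure comes from the discussion above.

```python
from typing import Dict, Iterable, Optional

# Hypothetical per-wiki settings. Only the 500-byte figure comes from the
# discussion above; a wiki with no entry here has not opted in.
HEADLINE_THRESHOLD_BYTES: Dict[str, int] = {"en": 500}


def number_of_articles(entries: Iterable[str]) -> int:
    """Stand-in for {{NUMBEROFARTICLES}}: counts every entry it is given."""
    return sum(1 for _ in entries)


def headline_article_count(entries: Iterable[str], wiki: str) -> int:
    """Stand-in for {{HEADLINEARTICLECOUNT}}.

    Counts only entries whose UTF-8 size meets the wiki's byte threshold.
    If the wiki has set no threshold, falls back to the plain count.
    """
    texts = list(entries)
    threshold: Optional[int] = HEADLINE_THRESHOLD_BYTES.get(wiki)
    if threshold is None:
        return number_of_articles(texts)
    return sum(1 for text in texts if len(text.encode("utf-8")) >= threshold)
```

Size is measured in bytes rather than characters here, so the same threshold means less text in languages whose characters take more than one byte in UTF-8. That is one more reason to let each wiki pick and document its own number.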
-- Daniel Mayer (aka mav)
WikiKarma The usual at [[March 8]] (I'm fresh out of WikiKarma so I need to work on creating some more balance in the Universe before I respond to your response).