[Wikipedia-l] beat Britannica!

Daniel Mayer maveric149 at yahoo.com
Sun Sep 22 19:20:03 UTC 2002


On Sunday 22 September 2002 05:38 am, Robert Graham Merkel wrote:
> I'd like to suggest another occur at another milestone.
> I realise the problems with the article number targets, but they
> do serve as something useful to point to the media about.
> If we did, we'd of course have to be careful *not* to claim that
> "we have X number of articles, hence we're X/100,000 as good as
> Britannica at this point".

Exactly why I have mentioned several times now that we need to have a more 
conservative definition of what we let the software detect as articles. There 
seemed to be some consensus for this, yet nothing has happened. I would write 
the new code myself, but I know exactly zippo about PHP and have zero time to 
learn it. 

We are already at 45,000+ "articles", which is way too close to the "50%" 
mark for my taste. In order to preserve legitimacy, we should be 
conservative in our total reporting of these numbers. 

Living organisms require a certain absolute minimum number of genes to be 
considered alive at all (around 300, though for practical purposes it is 
really a bit higher). I believe articles are similar and that it is not 
really possible to have an encyclopedia article on any subject in fewer than 
500 bytes (although for practical purposes this number is very often, but 
not always, too low).

So can we at least add a 500 byte filter to the current article count spec?
As has already been suggested, we can also add a line to the stats page that 
lists these "entries previously detected as being articles but are less than 
500 bytes". 
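
To make the proposal concrete, here is a rough sketch of what such a filter 
might look like. It is written in Python purely for illustration; it is not 
the wiki software's actual code. The detected_as_article argument is a 
hypothetical stand-in for whatever test the current software uses, and 
measuring size as UTF-8 bytes is my assumption.

```python
MIN_ARTICLE_BYTES = 500  # threshold proposed in this message


def byte_length(text):
    """Size of the page text in bytes (assumes UTF-8 storage)."""
    return len(text.encode("utf-8"))


def count_articles(pages, detected_as_article, min_bytes=MIN_ARTICLE_BYTES):
    """Return (articles, short_entries) for an iterable of page texts.

    detected_as_article is a hypothetical callable standing in for the
    existing article test; this sketch only adds the size filter on top.
    """
    articles = 0
    short_entries = 0
    for text in pages:
        if not detected_as_article(text):
            continue  # never counted as an article in the first place
        if byte_length(text) < min_bytes:
            short_entries += 1  # would go on the separate stats line
        else:
            articles += 1
    return articles, short_entries
```

The second number returned is the one that would feed the extra line on the 
stats page, so the short entries stay visible instead of simply vanishing 
from the count.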

If we did have such a spec then there would be more reason than ever for 
people to make sure sub-500 byte stubs are fleshed out.

-- Daniel Mayer (aka mav)



