[Wikipedia-l] Wikipedia v. Britannica

Jimmy Wales jwales at bomis.com
Mon Aug 18 14:37:43 UTC 2003

As a quick follow-up, Britannica also claims that the 32-volume print
encyclopedia has 44 million words.

I just picked a random article of 2,245 bytes, which came to 376
words.  That implies just under 6 bytes per word.  This statistic
could be improved by checking more articles, but the article looks
pretty normal to me, so I think it's a sensible ballpark figure for
now.

150,000 articles averaging 2,126 bytes means 318,900,000 bytes total
or just over 53 million words.

Considering *just* the 75,000 articles over 1500 bytes and assuming
conservatively that these are all *only* 1500 bytes long (manifestly
untrue), we are looking at 18,750,000 words for just those longer
articles.
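
For anyone who wants to rerun these numbers, here is a minimal sketch of
the arithmetic in Python.  The function and variable names are made up
for illustration, and it assumes the rounded figure of 6 bytes per word
taken from the single sample article, so treat the results as rough
estimates rather than measurements.

```python
# Back-of-the-envelope word counts from the figures quoted in this message.
# All names here are illustrative; only the numbers come from the message.

SAMPLE_BYTES = 2_245           # size of the one randomly picked article
SAMPLE_WORDS = 376             # word count of that same article
BRITANNICA_WORDS = 44_000_000  # Britannica's claimed word count

# Measured ratio from the sample: just under 6 bytes per word.
sample_bytes_per_word = SAMPLE_BYTES / SAMPLE_WORDS


def estimate_words(article_count, avg_bytes, bytes_per_word=6):
    """Estimate total words from an article count and an average size.

    Assumes the rounded 6 bytes per word by default (an assumption
    based on one sample article).
    """
    return article_count * avg_bytes / bytes_per_word


# All 150,000 articles at an average of 2,126 bytes each.
all_words = estimate_words(150_000, 2_126)

# Only the 75,000 longer articles, each counted at the 1500-byte floor.
long_words = estimate_words(75_000, 1_500)

print(f"bytes per word (sample): {sample_bytes_per_word:.2f}")
print(f"all articles:            {all_words:,.0f} words")
print(f"longer articles only:    {long_words:,.0f} words")
print(f"all articles vs. Britannica: {all_words / BRITANNICA_WORDS:.2f}x")
```

Passing the measured ratio instead of the rounded 6 (for example
estimate_words(150_000, 2_126, sample_bytes_per_word)) nudges the totals
up slightly, but does not change the overall picture.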

It seems clear to me that we are already "in the ballpark" of the size
of Britannica.  Quality is, of course, an entirely different question.
I think we are often superior and often drastically inferior.  I
suspect that our coverage would show strange and conspicuous 'holes' if
we went through it via a "top down" approach, i.e. take lists of major
topics and see if we've covered them.

