I'm pretty sure that English Wikipedia is the largest English-language encyclopaedia, but there are some humongous ones in China.

Baidu Baike, with almost 12.5 million articles, is way bigger than any single language version of Wikipedia, and Baike.com (formerly Hudong) is about a million bigger still.

OK, they are more inclusionist than us (recipes included), and they have somewhat dropped the distinction between a dictionary and an encyclopaedia.

So you can claim that Wikipedia, with nearly 35 million articles in 288 languages, is the largest encyclopaedia ever. Adding Wiktionary would make it even bigger.

Source: Wikipedia. I'm afraid I don't speak Chinese, so I can't check the Chinese sites myself.

Of course, article count is a flawed metric: combining almost all the individual Pokemon articles into a handful of lists reduced the number of Wikipedia articles by hundreds, but still left us with more information on Pokemon than I would want to see in a printed encyclopaedia. But then, can anyone suggest a meaningful metric for comparing such projects? Participants? Contributed edits? Shelf space if printed in traditional encyclopaedia-sized books? Gigabytes of text? Trays of microfiche?
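
To make the point about metrics concrete, here is a rough sketch (not a real measurement) showing how "largest" can flip depending on whether you count units, words, or bytes. The Corpus class, the function names, and the toy data are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    """Hypothetical container: a project name plus the text of each unit."""
    name: str
    documents: list[str]


def size_metrics(corpus):
    """Return three candidate 'size' figures for one corpus."""
    texts = corpus.documents
    return {
        "units": len(texts),                                   # articles, books, ...
        "words": sum(len(t.split()) for t in texts),           # whitespace-split words
        "bytes": sum(len(t.encode("utf-8")) for t in texts),   # UTF-8 encoded size
    }


def largest_by(corpora, metric):
    """Name of the corpus that wins under the chosen metric."""
    return max(corpora, key=lambda c: size_metrics(c)[metric]).name


# Toy data (assumption): many short entries versus one long entry.
many_short = Corpus("many_short", ["stub one", "stub two", "stub three"])
few_long = Corpus("few_long", ["a much longer entry " * 10])

for metric in ("units", "words", "bytes"):
    print(metric, "->", largest_by([many_short, few_long], metric))
```

With this toy data the many-stubs corpus wins on units and the single long entry wins on words and bytes, which is the Pokemon list problem in miniature.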



On 16 Sep 2015, at 01:24, Jonathan Morgan <jmorgan@wikimedia.org> wrote:

Hi Pine,

TL;DR: best to just say it's the largest encyclopedia ever. That should be safe.

Claims like this are hard to make because terms that seem concrete from afar tend to break down up close. For example: What do you mean by largest? 

Largest in bytes? Words? Content "units" (articles vs. manuscripts in this case, I guess)? Contributors?

What do you mean by "open text project"? Is archive.org an open text project? It has 8.2 million books. How would you compare the two? Does 1 book = 1 article?

Having said all that, I'm curious how others have/would craft a claim like this. My guess is that most of us who've written for an academic audience have settled for some variant of "largest encyclopedia" (you've got to put something in your Introduction paragraph, after all). What sayst?


On Tue, Sep 15, 2015 at 4:45 PM, Pine W <wiki.pine@gmail.com> wrote:
Hi researchers,

I could use a little help with understanding these dumps:



I'm trying to verify the claim that ENWP is the world's largest open text project, and to do that I need to verify that ENWP is larger than English Wikisource. Which files should I be comparing?

Are there any other projects that could make a claim to be a larger open text project than ENWP? Perhaps there's a library somewhere that has such a huge volume of out-of-copyright materials that the combined bytes of published text are larger than ENWP?
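
One possible way to compare two dumps by text volume is to stream each XML file and add up the wikitext bytes. The sketch below assumes a MediaWiki XML export with one revision per page (as in a current-revisions dump); the function names and the file name in the comment are hypothetical, and raw wikitext bytes include markup, so treat the result as a rough figure only.

```python
import bz2
import xml.etree.ElementTree as ET


def measure_dump(stream, namespaces=None):
    """Count pages and UTF-8 wikitext bytes in a MediaWiki XML export.

    stream:     binary file-like object containing the XML.
    namespaces: optional set of namespace numbers to keep (None keeps all).
    """
    pages = 0
    text_bytes = 0
    ns = None          # namespace number of the page being read
    page_bytes = 0     # text bytes seen so far in the current page
    for _event, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]   # drop the XML namespace prefix
        if tag == "ns":
            ns = int(elem.text)
        elif tag == "text":
            page_bytes += len((elem.text or "").encode("utf-8"))
        elif tag == "page":
            if namespaces is None or ns in namespaces:
                pages += 1
                text_bytes += page_bytes
            ns, page_bytes = None, 0
            elem.clear()                    # free memory for the finished page
    return {"pages": pages, "text_bytes": text_bytes}


def measure_dump_file(path, namespaces=None):
    """Same as measure_dump, but opens a bz2-compressed dump by path."""
    with bz2.open(path, "rb") as stream:
        return measure_dump(stream, namespaces)


# Hypothetical usage (file name is an assumption, not taken from this thread):
# print(measure_dump_file("some-wiki-pages-articles.xml.bz2", namespaces={0}))
```

Whether to restrict the count to namespace 0 is a judgment call: Wikisource keeps much of its transcribed text outside the main namespace, so filtering the same way on both projects may not give a like-for-like comparison.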



Wiki-research-l mailing list

Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
