I was thinking in terms of GB of text.
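
For what it's worth, here is a minimal sketch of how "GB of text" could be measured from a dump. It assumes a MediaWiki XML export of current revisions (a pages-articles style file) and counts only the UTF-8 bytes of wikitext in the main namespace. The function names (local_name, open_dump, text_bytes) are hypothetical, and the namespace choice is an assumption: Wikisource may keep much of its text outside namespace 0, so the filter would probably need adjusting there.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}text' -> 'text'."""
    return tag.rsplit("}", 1)[-1]


def open_dump(path):
    """Open a dump file in binary mode, decompressing .bz2 on the fly."""
    return bz2.open(path, "rb") if path.endswith(".bz2") else open(path, "rb")


def text_bytes(stream, namespaces=("0",)):
    """Return (page_count, total_bytes) of wikitext for the chosen namespaces.

    `stream` is a binary file-like object holding a MediaWiki XML export.
    Assumes one revision per page; with several, the last one seen is counted.
    """
    pages = 0
    total = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        ns = None
        text = ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "text":
                text = child.text or ""
        if ns in namespaces:
            total += len(text.encode("utf-8"))
            pages += 1
        elem.clear()  # drop the page's children so memory use stays bounded
    return pages, total


# Tiny made-up export used only to exercise the function.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>A</title><ns>0</ns><revision><text>hello</text></revision></page>
  <page><title>Talk:A</title><ns>1</ns><revision><text>ignored</text></revision></page>
  <page><title>B</title><ns>0</ns><revision><text>caf\xc3\xa9</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    print(text_bytes(io.BytesIO(SAMPLE)))
```

On a real dump one would pass open_dump(path) instead of the in-memory sample, with path pointing at whichever file in the dump directory holds the current page text. Note that this measures raw wikitext, markup included, which is only one of several possible definitions of "size".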

I too have wondered about creating closer ties between Wiktionary, Wikipedia and Wikisource so that it's easier for someone to start their search on one site and quickly find relevant pages on the other sites. This might (among other things) lead to an increase in pageviews. (Adding Toby to this email chain to see if he has any thoughts about that.) It would also conceivably lead to an increase in the "size" of Wikipedia (measured in bytes, content pages, and contributors) if Wiktionary and Wikisource were, for purposes of the reader, practically the same site. The downside might be increased complexity for contributors as the number of workflows grows, and the sites' standards for inclusion may differ.

Pine


On Wed, Sep 16, 2015 at 12:21 AM, WereSpielChequers <werespielchequers@gmail.com> wrote:
I'm pretty sure that English Wikipedia is the largest English language encyclopaedia, but there are some humongous ones in China.

Baidu Baike, with almost 12.5 million articles, is way bigger than any single language version of Wikipedia, and Baike.com (formerly Hudong) is about a million articles bigger still.

OK, they are more inclusionist than we are (recipes included), and they have somewhat dropped the distinction between a dictionary and an encyclopaedia.

So you can claim that Wikipedia, with nearly 35 million articles in 288 languages, is the largest encyclopaedia ever. Adding Wiktionary would make that even bigger.

Source: Wikipedia. I'm afraid I don't speak Chinese, so I can't check the figures myself.

Of course, article count is a flawed metric: combining almost all the individual Pokemon articles into a handful of lists reduced the number of Wikipedia articles by hundreds, but still left us with more information on Pokemon than I would want to see in a printed encyclopaedia. But then, can anyone suggest a meaningful metric for comparing such projects? Participants? Contributed edits? Shelf space if printed in traditional encyclopaedia-sized books? Gigabytes of text? Trays of microfiche?

Regards

Jonathan 


On 16 Sep 2015, at 01:24, Jonathan Morgan <jmorgan@wikimedia.org> wrote:

Hi Pine,

TL;DR: best to just say it's the largest encyclopedia ever. That should be safe.

Claims like this are hard to make because terms that seem concrete from afar tend to break down up close. For example: What do you mean by largest? 

Largest in bytes? Words? Content "units" (articles vs. manuscripts in this case, I guess)? Contributors?

What do you mean by "open text project"? Is archive.org an open text project? It has 8.2 million books. How would you compare the two? Does 1 book = 1 article?

Having said all that, I'm curious how others have crafted, or would craft, a claim like this. My guess is that most of us who've written for an academic audience have settled for some variant of "largest encyclopedia" (you've got to put something in your Introduction paragraph, after all). What sayst?

J

On Tue, Sep 15, 2015 at 4:45 PM, Pine W <wiki.pine@gmail.com> wrote:
Hi researchers,

I could use a little help with understanding these dumps:

https://dumps.wikimedia.org/enwikisource/latest/

https://dumps.wikimedia.org/enwiki/20150901/

I'm trying to verify the claim that ENWP is the world's largest open text project, and to do that I need to verify that ENWP is larger than English Wikisource. Which files should I be comparing?

Are there any other projects that could make a claim to be a larger open text project than ENWP? Perhaps there's a library somewhere that has such a huge volume of out-of-copyright materials that the combined bytes of published text are larger than ENWP?

Thanks!

Pine


_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l




--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
