Hi Pine,
TL;DR: best to just say it's the largest encyclopedia ever. That should be
safe.
Claims like this are hard to make because terms that seem concrete from
afar tend to break down up close. For example: What do you mean by largest?
Largest in bytes? Words? Content "units" (articles vs. manuscripts in this
case, I guess)? Contributors?
What do you mean by "open text project"? Is archive.org an open text
project? It has 8.2 million books. How would you compare the two? Does 1
book = 1 article?
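To make the "largest by what?" problem concrete, here is a rough Python sketch that tallies three of those measures (pages, bytes of wikitext, whitespace-separated words) from a MediaWiki XML export. It is only an illustration under some assumptions: the input holds one current revision per page, every page counts as a "unit" (redirects and non-article namespaces included), and `measure_dump` and `SAMPLE` are names I made up for the example. The three numbers can easily rank two projects differently.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def measure_dump(stream):
    """Stream a MediaWiki XML export and tally three candidate size metrics.

    Assumes one revision per page; a full-history dump would count the
    text of every revision.
    """
    totals = {"pages": 0, "bytes": 0, "words": 0}
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "text":
            text = elem.text or ""
            totals["bytes"] += len(text.encode("utf-8"))  # size of the raw wikitext
            totals["words"] += len(text.split())          # crude whitespace tokenization
        elif name == "page":
            totals["pages"] += 1  # counts every page, including redirects
            elem.clear()          # release the finished page's children
    return totals


# Tiny hand-written stand-in for a real dump (hypothetical content).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>A</title><revision><text>hello world</text></revision></page>
  <page><title>B</title><revision><text>one two three</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    print(measure_dump(io.BytesIO(SAMPLE)))
```

For a real compressed dump, the same function should accept the file object returned by `bz2.open(path, "rb")`, where `path` is whichever dump file you settle on. Even then, "words" in raw wikitext includes markup and templates, so the figures are at best a rough proxy.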
Having said all that, I'm curious how others have crafted (or would craft)
a claim like this. My guess is that most of us who've written for an
academic audience have settled for some variant of "largest encyclopedia"
(you've got to put something in your Introduction paragraph, after all).
What sayst?
J
On Tue, Sep 15, 2015 at 4:45 PM, Pine W <wiki.pine(a)gmail.com> wrote:
Hi researchers,
I could use a little help with understanding these dumps:
https://dumps.wikimedia.org/enwikisource/latest/
https://dumps.wikimedia.org/enwiki/20150901/
I'm trying to verify the claim that ENWP is the world's largest open text
project, and to do that I need to verify that ENWP is larger than English
Wikisource. Which files should I be comparing?
Are there any other projects that could make a claim to be a larger open
text project than ENWP? Perhaps there's a library somewhere that has such a
huge volume of out-of-copyright materials that the combined bytes of
published text are larger than ENWP?
Thanks!
Pine
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>