Hi researchers,

I could use a little help understanding these dumps:

https://dumps.wikimedia.org/enwikisource/latest/

https://dumps.wikimedia.org/enwiki/20150901/

I'm trying to verify the claim that ENWP is the world's largest open text project. To do that, I need to confirm that ENWP is larger than English Wikisource. Which files should I be comparing?

Are there any other projects that could claim to be a larger open text project than ENWP? Perhaps there's a library somewhere with such a huge volume of out-of-copyright material that the combined size in bytes of its published text is larger than ENWP's?

Thanks!

Pine