Sebastian Graf wrote:
Hello Tomasz,
thanks for your quick response.
Unfortunately, I need not just *text* but *English text*, since we are currently working on a revisioned indexer.
Are there any English dumps available besides enwiki?
Yup, you can grab
enwikisource enwikiversity enwikinews enwiktionary enwikiquote metawiki commonswiki
--tomasz
Tomasz Finc wrote:
Sebastian Graf wrote:
Hello Tomasz,
thanks for your quick response.
Unfortunately, I need not just *text* but *English text*, since we are currently working on a revisioned indexer.
Are there any English dumps available besides enwiki?
Yup, you can grab
enwikisource enwikiversity enwikinews enwiktionary enwikiquote metawiki commonswiki
--tomasz
Don't forget about Simple Wikipedia. It's the kind of content you'd find in Wikipedia, but written in simpler terms and with a much, much smaller dump size :)
Thank you very much, I found multiple usable dumps.
greetings
sebastian
On 14.06.2009 at 21:51, Platonides wrote:
Tomasz Finc wrote:
Sebastian Graf wrote:
Hello Tomasz,
thanks for your quick response.
Unfortunately, I need not just *text* but *English text*, since we are currently working on a revisioned indexer.
Are there any English dumps available besides enwiki?
Yup, you can grab
enwikisource enwikiversity enwikinews enwiktionary enwikiquote metawiki commonswiki
--tomasz
Don't forget about Simple Wikipedia. It's the kind of content you'd find in Wikipedia, but written in simpler terms and with a much, much smaller dump size :)
--------------------------------------------------
Sebastian Graf
Distributed Systems Lab
University of Konstanz
Phone: +49 7531 88 4319
Mail: sebastian.graf@uni-konstanz.de
xmldatadumps-l@lists.wikimedia.org