"Quick" request - Wiki-research-l

22 Feb 2016

Hi,

I was wondering if there is any place where I can find text (without
markup, etc) only versions of wikipedia suitable for NLP tasks? I've been
able to find a couple of old ones for the english wikipedia but I would
like to analyze different languages (mandarin, arabic, etc...).

Of course, any pointers to software that I can use to convert the usual XML
dumps to text would be great as well.

Best,

Bruno

*******************************************
Bruno Miguel Tavares Gonçalves, PhD
Homepage: www.bgoncalves.com
Email: bgoncalves(a)gmail.com
*******************************************