On 13/12/2014 13:30, misaka83@hush.com wrote:
Hello everyone,
how can I extract just clean plain text from a Wikipedia article? Without wiki-stuff, without html, without pictures, without json.
Just clean text.
I can't seem to find this exact solution.
Best regards
Mikoto
Use the TextExtracts API https://www.mediawiki.org/wiki/Extension:TextExtracts#API. For example, this query https://en.wikipedia.org/w/api.php?action=query&titles=Douglas+Adams&prop=extracts&explaintext=1&exintro=1 returns the lead section of the English Wikipedia article Douglas Adams https://en.wikipedia.org/wiki/Douglas_Adams in plain text.
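If you want to do this from a script, here is a minimal Python sketch of the same query. It is only an illustration: the helper names (build_extract_url, parse_extract, fetch_extract) and the User-Agent string are made up for this example, and I am assuming the default JSON response layout, where pages are keyed by page ID under query.pages. I added format=json so the reply is machine-readable.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title, intro_only=True):
    """Build a TextExtracts query URL for one article title."""
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        "prop": "extracts",
        "explaintext": 1,  # plain text instead of limited HTML
    }
    if intro_only:
        params["exintro"] = 1  # only the lead section
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_extract(payload):
    """Return the first extract found in a decoded API response, or None."""
    # Assumed layout: {"query": {"pages": {"<pageid>": {"extract": "..."}}}}
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None


def fetch_extract(title, intro_only=True):
    """Fetch the plain-text extract over HTTP (needs network access)."""
    request = urllib.request.Request(
        build_extract_url(title, intro_only),
        # Hypothetical User-Agent; replace with one identifying your script.
        headers={"User-Agent": "plain-text-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_extract(json.load(response))
```

Calling fetch_extract("Douglas Adams") should then give you the lead section as a plain string, and passing intro_only=False drops exintro from the query.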