On 13/12/2014 13:30, misaka83(a)hush.com wrote:
Hello everyone,
how can I extract just clean plain text from a Wikipedia article?
Without wiki-stuff, without html, without pictures, without json.
Just clean text.
I can't seem to find this exact solution.
Best regards
Mikoto
Use the TextExtracts API
<https://www.mediawiki.org/wiki/Extension:TextExtracts#API>.
For example, this query
<https://en.wikipedia.org/w/api.php?action=query&titles=Douglas+Adams&prop=extracts&explaintext=1&exintro=1>
returns the lead section of the English Wikipedia article Douglas Adams
<https://en.wikipedia.org/wiki/Douglas_Adams> in plain text.
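If you want to run the same query from a script, here is a minimal sketch in Python using only the standard library. It builds the same kind of URL as above (explaintext=1 asks for plain text, exintro=1 limits the result to the lead section) and adds format=json so the response is easy to parse. The function names, the User-Agent string and the timeout are my own choices for illustration, not part of the API, and the parsing assumes the default JSON layout where pages sit under query.pages keyed by page ID.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title, intro_only=True):
    """Build a TextExtracts query URL for one article title."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,   # plain text instead of limited HTML
        "titles": title,
        "format": "json",
    }
    if intro_only:
        params["exintro"] = 1  # only the text before the first section
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_extract(response_text):
    """Return the first extract found in a JSON response, or None."""
    data = json.loads(response_text)
    # Assumed layout: {"query": {"pages": {"<pageid>": {"extract": ...}}}}
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return page["extract"]
    return None  # e.g. the title does not exist


def fetch_extract(title, intro_only=True):
    """Fetch the plain-text extract of an article over HTTP."""
    request = urllib.request.Request(
        build_extract_url(title, intro_only),
        # Hypothetical User-Agent; replace with something identifying you.
        headers={"User-Agent": "plain-text-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_extract(response.read().decode("utf-8"))
```

Calling fetch_extract("Douglas Adams") should then give you the lead section as a plain string. Passing intro_only=False drops exintro to ask for more than the lead, though the extension may limit how much text it returns.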