[Mediawiki-api] parse wikipedia
Roan Kattouw
roan.kattouw at home.nl
Sun Feb 22 19:40:01 UTC 2009
marco tanzi wrote:
> Hi folks,
>
> I am trying to work with the Wikipedia API and I am having a few small
> problems :-)
>
> I can fetch the main description of the topic I am looking for using:
>
> http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=0&format=json&pageids=wiki_id
>
> I receive a valid JSON object, but the content of the revision is
> full of markup I do not need, like {{....}}, [[...]], etc. I would like
> to get only the clean description as plain text (like the one visible
> on the wiki website).
>
> How can I do that? Is there a parser to clean up my JSON object?
>
> Hope someone can help me out!
There's nothing ready-made for this, AFAIK. You can either get the
wikitext content (which is what you're doing now) or the HTML version
through action=parse. Your best bet is probably to strip the {{}} and
[[]] markup yourself, e.g. with regexes.
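
A rough Python sketch of both options, for illustration only. The
function names (build_parse_url, strip_wikitext) and the sample text are
made up for this example, and the page/prop=text parameters for
action=parse are assumed here rather than taken from the request above.
The regexes only cover simple {{templates}}, [[links]] and ''quote''
markup, so tables, <ref> tags and other constructs will still get
through.

```python
import re
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"

# Innermost {{...}}: a template body that contains no further braces.
TEMPLATE = re.compile(r"\{\{[^{}]*\}\}")
# [[target]] or [[target|label]]; keeps the text after the last pipe.
LINK = re.compile(r"\[\[(?:[^\[\]]*\|)?([^\[\]|]*)\]\]")
# ''italic'' and '''bold''' markers.
QUOTES = re.compile(r"'{2,}")


def build_parse_url(title):
    """Build an action=parse URL that asks for the rendered HTML of a page.

    Assumption: the page is addressed by title via the `page` parameter.
    """
    params = {"action": "parse", "page": title, "prop": "text", "format": "json"}
    return API + "?" + urlencode(params)


def _sub_until_stable(pattern, replacement, text):
    """Apply a substitution repeatedly until the text stops changing."""
    previous = None
    while previous != text:
        previous = text
        text = pattern.sub(replacement, text)
    return text


def strip_wikitext(text):
    """Crudely reduce wikitext to plain text using regexes only."""
    # Work from the inside out so nested templates disappear as well.
    text = _sub_until_stable(TEMPLATE, "", text)
    # Replace each link with its visible label.
    text = _sub_until_stable(LINK, r"\1", text)
    text = QUOTES.sub("", text)
    # Collapse the runs of spaces left behind by removed markup.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    sample = (
        "{{Infobox|name=X {{nested}}}}'''Python''' is a "
        "[[programming language|language]] created by [[Guido van Rossum]]."
    )
    print(strip_wikitext(sample))
    print(build_parse_url("Python (programming language)"))
```

If you go the action=parse route instead, you still have to strip the
HTML tags from the result, so neither approach gives you clean text for
free.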
Roan Kattouw (Catrope)