I've been spending hours on the parsing now and don't find it simple
at all, because templates can be nested. Just extracting the Infobox
as one big lump is hard due to the need to match nested {{ and }}.
Andrew Dunbar (hippietrail)
Hi,
Come now, you are over-thinking it. Find "{{Infobox [Ll]anguage" in
the text, then count braces. Start at depth=2, count up and down until
you reach 0, and you are at the end of the template. (You can be picky
about only counting them if paired if you like ;-)
Then just regex match the lines/parameters you want.
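
Roughly what I mean, as an untested-in-anger Python sketch. The names
extract_infobox and get_param are made up for illustration, and it
assumes one "|name = value" parameter per line. It starts the depth at
0 and counts the opening {{ itself, which comes to the same thing as
starting at 2 just after it. It only counts braces in pairs and does
not try to cope with {{{triple-brace}}} parameters.

```python
import re

# Same pattern as suggested above; the braces are escaped for the regex engine.
START = re.compile(r"\{\{Infobox [Ll]anguage")

def extract_infobox(text):
    """Return the whole {{Infobox language ...}} template, or None."""
    m = START.search(text)
    if m is None:
        return None
    depth = 0
    i = m.start()
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":        # opening pair: two levels deeper
            depth += 2
            i += 2
        elif pair == "}}":      # closing pair: two levels back up
            depth -= 2
            i += 2
            if depth == 0:      # back at 0: end of the infobox
                return text[m.start():i]
        else:
            i += 1
    return None                 # braces never balanced

def get_param(template, name):
    """Regex-match a single '|name = value' line inside the template."""
    pattern = r"^[ \t]*\|[ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*(.*)$"
    m = re.search(pattern, template, re.MULTILINE)
    return m.group(1).strip() if m else None
```

So extract_infobox(wikitext) gives you the one big lump, nested
templates included, and get_param(lump, "iso3") pulls a single line
out of it.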
However, if you are pulling the wikitext with the API, the XML parse
tree option sounds good; then you can just use ElementTree (or the
like) and pull out the parameters directly.
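
A minimal sketch of that route, assuming the parse tree comes back as
<template> elements holding a <title> and one <part> per parameter,
each with a <name> and a <value>. SAMPLE below is hand-written to
match that assumption, not real API output, and infobox_params is a
made-up name, so check the shape against what the API actually
returns.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the parse tree XML (an assumption, see above).
SAMPLE = """<root><template><title>Infobox language
</title><part><name>name</name>=<value>Foo
</value></part><part><name>iso3</name>=<value>foo
</value></part></template></root>"""

def infobox_params(parsetree_xml):
    """Return the Infobox language parameters as a dict, or None."""
    root = ET.fromstring(parsetree_xml)
    for tpl in root.iter("template"):
        title = tpl.find("title")
        if title is None:
            continue
        if "".join(title.itertext()).strip().lower() != "infobox language":
            continue
        params = {}
        for part in tpl.findall("part"):
            name = part.find("name")
            value = part.find("value")
            if name is None or value is None:
                continue
            # Unnamed (positional) parameters are assumed to carry an
            # index attribute on <name> instead of text.
            key = "".join(name.itertext()).strip() or name.get("index")
            # itertext() flattens nested templates down to their text.
            params[key] = "".join(value.itertext()).strip()
        return params
    return None
```

With that, infobox_params(SAMPLE)["iso3"] gets you the code directly,
no brace counting needed.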
Robert