Hi,
I am new to the list. I would like to parse wikitext files from Wikipedia and have some questions:
1) Which parser produces the most faithful rendering of Wikipedia content? And which is the fastest?
2) If I decide to write my own parser, how can I handle the rendering of infoboxes/macros? Perhaps the idea is to parse the wikitext myself but handle the macros/infoboxes/templates with a separate library. I imagine there are many different templates, and it would be silly to parse/render each one myself if existing libraries already handle them well.
Thanks and regards,
Jose
Jose wrote:
> Hi,
> I am new to the list. I would like to parse wikitext files from Wikipedia and have some questions:
> - Which parser produces the most faithful rendering of Wikipedia content?
The MediaWiki parser.
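
One way to get at the MediaWiki parser without installing MediaWiki is to ask a wiki's Action API to render the wikitext for you (action=parse). This is only a rough sketch: the endpoint choice, the function names and the User-Agent string are my own assumptions for illustration, and a bulk job should work from dumps rather than hammer the live API.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's public Action API endpoint.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_params(wikitext):
    """Build the parameters for an action=parse request (no network access)."""
    return {
        "action": "parse",
        "text": wikitext,
        "contentmodel": "wikitext",
        "prop": "text",
        "format": "json",
        "formatversion": "2",
    }


def render(wikitext, user_agent="wikitext-example/0.1 (hypothetical contact)"):
    """POST the wikitext to the API and return the rendered HTML string."""
    data = urllib.parse.urlencode(build_parse_params(wikitext)).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=data, headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response:
        # With formatversion=2 the rendered HTML is a plain string.
        return json.load(response)["parse"]["text"]
```

Calling render("'''bold''' text") would return the HTML produced by the real parser, with templates already expanded.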
> And which is the fastest?
I don't know of any parser comparison, but a parser can be very fast if you don't mind dropping some features.
> - If I decide to write my own parser,
That's a bad idea.
> how can I handle the rendering of infoboxes/macros?
They're called templates. A preprocessing step expands (substitutes) the templates first, and you then parse the resulting wikitext.
> Perhaps the idea is to parse the wikitext myself but handle the macros/infoboxes/templates with a separate library. I imagine there are many different templates, and it would be silly to parse/render each one myself if existing libraries already handle them well.
You can't parse each template separately: templates are not independent, because they take parameters, so the same template expands differently depending on how it is called.
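
To make the preprocessing idea concrete, here is a toy sketch of template expansion with parameters. It is not how MediaWiki implements it: the TEMPLATES dictionary and the function names are made up for the example, missing parameters without a default become empty strings, and arguments containing "|" (such as piped links) are not handled.

```python
import re

# Hypothetical template store: template name -> template body (wikitext).
TEMPLATES = {
    "greet": "Hello, {{{1}}}{{{punct|!}}}",
    "infobox": "'''{{{name}}}''' ({{greet|{{{name}}}}})",
}

# An innermost template call: {{...}} with no braces inside.
CALL = re.compile(r"\{\{([^{}]*)\}\}")
# A parameter reference: {{{key}}} or {{{key|default}}}.
PARAM = re.compile(r"\{\{\{([^{}|]+)(?:\|([^{}]*))?\}\}\}")


def expand_call(match):
    """Expand one template call, substituting its parameters into the body."""
    parts = match.group(1).split("|")
    name = parts[0].strip()
    body = TEMPLATES.get(name)
    if body is None:
        # Unknown template: leave a link to the missing template page.
        return "[[Template:%s]]" % name
    args = {}
    position = 0
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            args[key.strip()] = value.strip()
        else:
            position += 1
            args[str(position)] = part

    def fill(param):
        default = param.group(2) or ""
        return args.get(param.group(1).strip(), default)

    return PARAM.sub(fill, body)


def preprocess(text, max_passes=20):
    """Expand template calls repeatedly until the text stops changing."""
    for _ in range(max_passes):
        expanded = CALL.sub(expand_call, text)
        if expanded == text:
            break
        text = expanded
    return text


print(preprocess("{{greet|World}}"))           # Hello, World!
print(preprocess("{{greet|World|punct=?}}"))   # Hello, World?
print(preprocess("{{infobox|name=Ada}}"))      # '''Ada''' (Hello, Ada!)
```

The same {{greet}} template gives different output for different parameters, and the result of the last call is still wikitext (the bold markup) that goes to the parser afterwards.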
> Thanks and regards,
> Jose
Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l