Hi Legoktm
No, I haven't looked at it; I didn't actually know it existed. It looks interesting and well tested, so it is most likely better than what I wrote. It probably didn't exist when I wrote my code, so back then there was a need for it. As I said, what I have is a partial parser that met my needs, and I hoped it might be useful to some people. At the time it seemed to work reasonably well for my application, search and replace of template parameters, but mwparserfromhell appears to have much more functionality.
Hannes
On 12 April 2014 04:07, Legoktm <legoktm.wikipedia@gmail.com> wrote:
On 04/07/2014 05:35 AM, Hannes Röst wrote:
== Template parser ==
https://github.com/hroest/pywikibot-compat/tree/feature/template_parser
For one bot project on the German Wikipedia I had to parse rather complex templates and replace specific fields. The templates could contain nested templates, math formulas and references. I therefore wrote a template parser that parses these templates and returns them as key-value pairs, which makes it easy to query specific keys and replace their values. The code worked well on several thousand templates of the German chemistry project and should be rather straightforward to use. This is library code, so there is no bot associated with it; see templateparser.py and tests/test_templateparser.py.
In order to handle nesting correctly and to distinguish equals signs that belong to key-value pairs from those in mathematical formulas etc., I also had to write a partial MediaWiki syntax parser that recognizes such constructs in wikitext. This code is in textrange_parser.py and makes it possible to extract specific parts of a text (e.g. wikitables, templates, wikilinks, weblinks). Tests are in tests/test_textrange_parser.py.
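To illustrate the nesting problem described above, here is a minimal Python sketch. It is not the code from templateparser.py or textrange_parser.py; the names `split_top_level` and `parse_template` and the sample template are hypothetical and chosen only for this example. It assumes well-formed input and only protects `{{...}}`, `[[...]]` and simple `<math>`, `<ref>` and `<nowiki>` elements, so it ignores wikitables, comments and many other cases a real parser has to handle.

```python
import re

# Spans that must never be split: <math>, <ref> and <nowiki> elements.
# Assumption: simple, non-self-closing tags without "/" in their attributes.
PROTECTED = re.compile(r"<(math|ref|nowiki)\b[^>/]*>.*?</\1>", re.DOTALL)


def split_top_level(text, sep, maxsplit=-1):
    """Split text at sep, skipping separators inside {{..}}, [[..]] or protected tags."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text) and (maxsplit < 0 or len(parts) < maxsplit):
        protected = PROTECTED.match(text, i)
        if protected:
            # Jump over the whole element so its "=" and "|" are ignored.
            i = protected.end()
        elif text.startswith(("{{", "[["), i):
            depth += 1
            i += 2
        elif text.startswith(("}}", "]]"), i):
            depth = max(depth - 1, 0)
            i += 2
        else:
            if text[i] == sep and depth == 0:
                parts.append(text[start:i])
                start = i + 1
            i += 1
    parts.append(text[start:])
    return parts


def parse_template(wikitext):
    """Return (name, params) for a single template such as '{{Name|a=1|b}}'."""
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("expected a single {{...}} template")
    name, *fields = split_top_level(text[2:-2], "|")
    params, positional = {}, 0
    for field in fields:
        # Only the first top-level "=" separates key from value.
        pieces = split_top_level(field, "=", maxsplit=1)
        if len(pieces) == 2:
            params[pieces[0].strip()] = pieces[1].strip()
        else:
            # No top-level "=": an unnamed parameter, numbered from 1.
            positional += 1
            params[str(positional)] = field.strip()
    return name.strip(), params


# Hypothetical sample with a nested template, a formula and a wikilink.
sample = ("{{Infobox|Name=Wasser|Dichte={{Wert|1=0,998|e=g/cm3}}"
          "|<math>E = mc^2</math>|Bild=[[Datei:x.png|thumb]]}}")
name, params = parse_template(sample)
print(params["Dichte"])  # -> {{Wert|1=0,998|e=g/cm3}}
```

Tracking a depth counter is the simplest way to keep the pipes and equals signs of the nested `{{Wert|...}}` call out of the outer template's parameter list, and skipping the `<math>` element keeps the equals sign in the formula from being read as a key-value separator.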
Have you looked into using mwparserfromhell[1]? It's a true parser which even has C speedups. Support for it is already in pywikibot, it's just not turned on by default.
[1] https://github.com/earwig/mwparserfromhell
-- Legoktm
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l