One important question for me here is how malformed wiki syntax is handled, for example a text body like this:
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###
=== bad header 1 ==
A text containing mathematical equations like a = b + c or even something that could look like a header, like a == b == c but is anything BUT a header.
== bad header 2 ===
Some lorem ipsum bla bla ...
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###
I am desperately seeking a parser that has the same error behaviour and gives the same results as the original MediaWiki parser, even for malformed wiki text.
Is this the case for mwparserfromhell?
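To make the comparison concrete, here is a minimal sketch of the kind of reference check I have in mind. It assumes that the original parser decides headings line by line, trying six '=' signs down to one and requiring the same run at the start and at the end of the line. That rule is my assumption and not something confirmed in this thread, so it should be verified against a real MediaWiki instance. The function name find_headings is made up for illustration.

```python
import re


def find_headings(wikitext):
    """Return (level, title) for each line the assumed rule treats as a heading."""
    headings = []
    for line in wikitext.splitlines():
        # Try the deepest level first, then fall back to shallower ones.
        for level in range(6, 0, -1):
            marker = "=" * level
            # The same run of '=' must open and close the line.
            pattern = r"^%s[ \t]*(.+?)[ \t]*%s\s*$" % (marker, marker)
            match = re.match(pattern, line)
            if match:
                headings.append((level, match.group(1)))
                break
    return headings


sample = "\n".join([
    "=== bad header 1 ==",
    "A text containing equations like a = b + c or a == b == c.",
    "== bad header 2 ===",
    "Some lorem ipsum bla bla ...",
])

for level, title in find_headings(sample):
    print(level, repr(title))
```

Under this assumed rule both unbalanced lines come out as level 2 headings, with the surplus '=' kept as part of the title, and the equation line is not a heading at all. The list could then be compared with the level and title of each node returned by mwparserfromhell.parse(sample).filter_headings().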
Thanks and greetings,
DrTrigon
On 24.04.2013 13:11, legoktm wrote:
Hi all, I had mentioned this in the rewrite roadmap, and noticed it came up on IRC as well, so I'd like to run this by the mailing list:
User:The Earwig has written a pure-Python (with optional C speedups) MediaWiki text parser named mwparserfromhell[1]. Currently we have the textlib library and various regexes that do this imperfectly. From my experience using mwparserfromhell (over 400k successful edits with no issues), I believe it is ready to be bundled with the framework. I think it would still be a good idea to keep textlib as a fallback, or for users who are currently using it and don't need to migrate.
As for actually adding it, in the rewrite branch we can just add it as a dependency in setup.py, and then convert various methods over. In trunk, I'm guessing we would need to add it as an external. (I'm not sure how that's actually done.)
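For the rewrite branch, the dependency could look roughly like the fragment below. This is only a sketch: the package name and the missing version pin are assumptions, and SETUP_KWARGS is a made-up name used to keep the example self-contained.

```python
# Hypothetical fragment of setup.py; the name and the unpinned version are assumptions.
SETUP_KWARGS = dict(
    name="pywikibot",
    # Declares the parser as a dependency that is installed with the framework.
    install_requires=["mwparserfromhell"],
)

# In a real setup.py this would be passed on as: setuptools.setup(**SETUP_KWARGS)
```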
[1] https://github.com/earwig/mwparserfromhell
-- Legoktm
_______________________________________________ Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l