Platonides platonides@gmail.com wrote:
Any pointers to things that I overlooked? Thoughts on interfaces & Co.? Volunteers? :-)
It's a bit hard for me to understand what your tool does, since it gives a blank page when English is selected, and it takes the html source instead of the wiki source.
Ah! Didn't notice that. It works (solely) on the wiki source, though.
I get that you look for two kinds of bugs: "wiki text errors" (like an unclosed tag) and "wikipedia errors" (the date doesn't conform to the manual of style). [...]
It does mostly the latter, but I'm not looking for some grammar to define an article complying with a manual of style, but for a parser to parse wikitext.
[...] I have dealt with the parser a bit (see bug 18765) and I don't think we could make some things remotely sane as they are handled at completely different steps. But linting completely insane ones shouldn't be too hard. :)
On the other hand, going into the Parser is probably quite far from what you expected when wanting to leave your ugly mess of regexes. Also, I may have misunderstood your position and it may not be appropriate for your lint expectations.
I think so :-). My use case with wikilint and some other tools is:
- Are there more than one and fewer than x images per article?
- Is there more than one link to another article?
- Are there links in a "See also" section that have already appeared in the article?
- If there are "Main article:" links, do they appear directly following a section heading, indented and italic?
- Does the {{Personendaten}} data have a fuzzy relationship to the introductory line of the article?
To address these, I'd like to parse the wiki source from a flat sequence of characters into a logical structure. The MediaWiki parser does not seem to care for that, so I have not looked further into it (and don't plan to do so).
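For illustration, a few of the checks above could be sketched directly over raw wikitext with a handful of regexes, which is exactly the kind of matching a proper parser would replace with a walk over a logical structure. Everything here (the function name, the threshold, the link patterns) is made up for the example and not taken from wikilint or MediaWiki:

```python
import re

def lint_article(wikitext, max_images=10):
    """Run a few of the checks above on raw wikitext.

    A regex-based sketch for illustration only; a real parser
    should replace exactly this kind of pattern matching.
    """
    problems = []

    # Check 1: between one and max_images images per article.
    images = re.findall(r'\[\[(?:File|Image|Datei|Bild):', wikitext)
    if not (1 <= len(images) <= max_images):
        problems.append(f"expected 1..{max_images} images, found {len(images)}")

    # Check 2: at least one plain wikilink to another article.
    # (Links containing ':' -- files, categories -- are skipped for simplicity.)
    links = re.findall(r'\[\[([^\]|:]+)(?:\|[^\]]*)?\]\]', wikitext)
    if not links:
        problems.append("no links to other articles")

    # Check 3: "See also" entries that already appeared earlier in the text.
    m = re.search(r'==\s*See also\s*==(.*?)(?:\n==[^=]|\Z)', wikitext, re.S)
    if m:
        before = wikitext[:m.start()]
        seen = {l.lower() for l in re.findall(r'\[\[([^\]|]+)', before)}
        for link in re.findall(r'\[\[([^\]|]+)', m.group(1)):
            if link.lower() in seen:
                problems.append(f'"See also" repeats link: {link}')

    return problems
```

Note how the third check already has to guess where the "See also" section ends; with a real parse tree, sections and links would simply be nodes to iterate over.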
So, to emphasize: I'm looking for *a* parser, that's a lowercase "p".
Tim