Warning: Yet Another Crazy Idea of Mine ahead. If you're sick of these
(by bitter experience;-) delete this mail *now*.
Still here? Great!
OK, we all know that the current parser, while working, is not the final
word. It is kind of slow due to its multi-pass design, the source is
confusing, and there are some persistent bugs in it, like the template
malfunctions.
I therefore suggest a new structure:
1. Preprocessor
2. Wiki markup to XML
3. XML to (X)HTML
Let me go through these. The preprocessor would basically do the
template and variable stuff; dumb text replacement, like a C/C++
preprocessor. It would generate the complete wiki-markup text, which is
then carefully chewed by the to-XML-converter. The XML output is then
converted into HTML/XHTML for display.
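To make the preprocessor idea concrete, here is a minimal sketch in Python (chosen just for brevity; the names and the parameterless-template simplification are mine, not a proposed implementation):

```python
import re

def preprocess(text, templates, variables):
    """Step 1: dumb text replacement, like a C preprocessor.

    Expands {{name}} from a variable table or a template table,
    with no knowledge of markup structure. Template parameters
    and recursion are left out of this sketch.
    """
    def expand(match):
        name = match.group(1)
        if name in variables:
            return variables[name]
        if name in templates:
            return templates[name]
        # Unknown names are left untouched.
        return match.group(0)

    return re.sub(r"\{\{([^{}|]+)\}\}", expand, text)
```

For example, `preprocess("Hello {{SITENAME}}!", {}, {"SITENAME": "Wikipedia"})` yields the complete wiki-markup text `"Hello Wikipedia!"`, ready for the to-XML converter.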
Why have this XML step in there? Couple of reasons.
I am thinking of XML that can be generated from the wiki text /without
any knowledge of the rest of the database/. The converter will not check,
"does this article exist" or "is this an internal link, an image, an
interwiki link, or what?". It should only convert "[[this|or that]]" to
"<wikilink page='this'>or that</wikilink>". Also, it should produce only
valid XML, no matter the input.
IMHO that would separate the actual *parsing* of the wiki markup from
its *meaning*. There are useful methods, functions, libraries
and-what-not for dealing with XML.
A few points in favor of this idea:
* The wiki parser (#2) can be clean, brief and efficient without
worrying about the context of the page; it can focus on parsing wiki markup.
* The XML-HTML-converter (#3) can focus on the pure context of the page:
make a normal link, stub link, thumbnailed image, or a category or
interwiki link, etc.
* Both can be developed and maintained independently. To make thumbnail
display behave differently, I won't have to look at the dirty wiki
markup parsing at all ;-)
* We could have a decent XML output function at no cost.
* Other wikis, used to a different syntax, could easily switch to
MediaWiki by adapting the wiki-to-XML module.
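The context-aware step (#3) could then look roughly like this sketch, where `existing_pages` is a stand-in for a real database lookup and the "new page" CSS class is just an illustration:

```python
import xml.etree.ElementTree as ET

def xml_to_html(xml_text, existing_pages):
    """Step 3: XML to HTML, with full page context.

    Decides per <wikilink> whether to render a normal internal link
    or a "new page" link -- the database knowledge that the
    wiki-to-XML step deliberately lacks. existing_pages stands in
    for a real database query.
    """
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for el in root:
        if el.tag == "wikilink":
            page = el.get("page")
            cls = "internal" if page in existing_pages else "new"
            parts.append('<a class="%s" href="/wiki/%s">%s</a>'
                         % (cls, page, el.text or page))
        parts.append(el.tail or "")
    return "".join(parts)
```

The same intermediate XML could feed other renderers (print, export, whatever) without touching the wiki parser at all, which is exactly the modularity argument above.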
For wiki-to-XML, we could even use an external parser. I am currently
toying around with one in C++ (yes, another one; and yes, one could
probably write it in two lines of Perl, so go ahead! ;-)
I *do* realize that this would mean a great change in our current
software. Therefore, I do not demand that this be implemented in 1.3.1 :-)
But, in the long run, I strongly believe that this is the way to go. The
performance loss from doing two steps instead of one might even be
compensated for by the increased performance of each specialized part. The
value of making the parser more modular, however, should IMHO not be
underestimated.
Magnus
P.S.: Hurricane "Charley" - Britannica's last hope... ;-)