[Wikitech-l] Re: xml wiki representation

14 Feb 2005


Magnus Manske wrote:
...
Jim Higson schrieb:
...
A while ago I started some experimental client software that took the
output from wiki2xml. I got sidetracked, but now that I have some more
time I want to get back to it.
A few questions:
I've searched the list and see there is now a proper flex/bison parser.
The wiki2xml converter has not had any check-ins for a while, so I presume
it's now defunct?
Yup. If you know Bison, we'd be glad if you could take a look at it.
Especially the HTML parsing needs a lot of work.
I'm afraid not. I took a class last year in lex+yacc, so I mostly know my
way round a spec, but I've no experience using it for a real language,
especially one like wikitext, which wasn't designed with formal grammars in
mind.
A quick overview of what I'm doing: for my undergraduate dissertation I'm
writing a partial reimplementation of the MediaWiki interface without any
dynamic component on the server. This isn't intended to replace the current
PHP interface; I am running it as an experiment into what is possible using
very low-spec web servers.
At the moment what I've got uses a JavaScript half-port of wiki2xml. If the
project were to be taken any further it would have to use a parser
functionally identical to the Bison one, which as far as I can see would
involve either modifying Bison to output JavaScript (very hard) or writing a
C to JavaScript converter (also very hard!). As you can probably tell, I'll
never fully reimplement the parsing process and don't intend this code to
be used except as a neat demonstration. Still, I'd like my intermediate
XML format to be near the 'official' one, because it is possible my
presentation layer might be teamed up with a server-side parser (using
something like &action=parsedxml instead of &action=raw). Even so, it isn't
trying to be a replacement interface, because it places too many
requirements on the client, and for the /Special:foo pages it will probably
always delegate to PHP. At best it might one day be possible to run this
project in parallel to a MediaWiki wiki.
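To make the &action=raw versus &action=parsedxml idea concrete, here is a
minimal Python sketch of how a client might build the request URL. The base
URL and the page_url helper are made up for illustration, and
action=parsedxml is only the hypothetical action floated above, not an
existing one.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the wiki's own index.php location.
BASE = "http://example.org/w/index.php"

def page_url(title, action="raw", base=BASE):
    """Build a URL asking the wiki for a page in a given form.

    action="raw" returns the wikitext source. "parsedxml" is the
    hypothetical server-side XML action discussed in this thread.
    """
    return base + "?" + urlencode({"title": title, "action": action})

raw_url = page_url("Main_Page")               # wikitext for a client-side parser
xml_url = page_url("Main_Page", "parsedxml")  # would need server support
```

With something like this, the presentation layer only has to change the
action parameter to move from client-side to server-side parsing.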
...
In the flexbisonparse module, there is also a "preprocessor" of mine
which tries to convert HTML to wiki text as far as possible, which might
then ease the parser code. With the preprocessor, the parser basically
only needs to handle <div> and <font>, plus the usual wiki tags
(<pre>, <nowiki>, <math> etc.).
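As a rough illustration of that preprocessing idea (this is not the actual
flexbisonparse preprocessor, whose rules aren't given in this thread), a
Python sketch might rewrite attribute-free inline HTML tags into wiki markup
and pass everything else through. The tag table and the html_to_wikitext
name are assumptions made for the example.

```python
import re

# Assumed mapping from simple inline HTML tags to wiki markup.
# The real preprocessor may cover more (or different) tags.
INLINE_TAGS = {
    "b": "'''",
    "strong": "'''",
    "i": "''",
    "em": "''",
}

# Matches opening or closing tags without attributes, e.g. <b> or </i>.
TAG_RE = re.compile(r"</?(\w+)\s*>")

def html_to_wikitext(text):
    """Replace known inline tags; leave all other tags for the parser."""
    def replace(match):
        name = match.group(1).lower()
        # Unknown tags (<div>, <pre>, <nowiki>, <math>, ...) are kept as-is.
        return INLINE_TAGS.get(name, match.group(0))
    return TAG_RE.sub(replace, text)
```

Note that this sketch does not skip the contents of <nowiki> or <pre>, which
a real preprocessor would have to do.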
...
Does the flex/bison parser produce roughly the same XML as wiki2xml?
(same tag names, nesting etc)
No. But the new one is better! :-)
Good, except this means a bit more work for me ;)
...
...
Is there a DTD, XML schema for the wikiXML? How about a rough spec?
No DTD or the like, but try the example at the end of this mail (can't
attach files on the mailing list...)
Your help with the parser would be much appreciated.
I wish I could give more help with it. I can't really do much of anything
until this dissertation is done. After that, possibly.
The example was very helpful, thanks.
Jim
