Jim Higson wrote:
A while ago I started some experimental client software that took the output from wiki2xml. I got sidetracked, but now that I have some more time I want to get back to it.
A few questions:
I've searched the list and see there is now a proper flex/bison parser. The wiki2xml converter has not had any check-ins for a while, so I presume it's now defunct?
Yup. If you know Bison, we'd be glad if you could take a look at it. Especially the HTML parsing needs a lot of work.
In the flexbisonparse module, there is also a "preprocessor" of mine which tries to convert HTML to wiki text as far as possible, which might then simplify the parser code. With the preprocessor, the parser basically only needs to handle <div> and <font>, plus the usual wiki tags (<pre>, <nowiki>, <math>, etc.).
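To illustrate the idea of converting HTML to wiki text before parsing, here is a minimal Python sketch. It is not the preprocessor from the flexbisonparse module: the function name `html_to_wiki` and the tag mapping are assumptions made for illustration, and it only covers simple inline bold and italic tags without attributes.

```python
import re

# Assumed mapping for illustration only; the tag coverage of the real
# preprocessor is not described in this mail.
INLINE_TAGS = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}

# Matches opening and closing forms of the mapped tags (no attributes).
_TAG_PATTERN = re.compile(r"</?(b|strong|i|em)\s*>", re.IGNORECASE)


def html_to_wiki(text):
    """Replace simple HTML bold/italic tags with wiki quote markup.

    Everything else (<div>, <font>, <pre>, <nowiki>, <math>, ...) is left
    untouched for the parser to deal with.
    """
    return _TAG_PATTERN.sub(lambda m: INLINE_TAGS[m.group(1).lower()], text)


if __name__ == "__main__":
    print(html_to_wiki("This is <b>bold</b> and <i>italics</i>."))
```

Tags outside the mapping pass through unchanged, which matches the point above that <div> and <font> remain the parser's job.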
Does the flex/bison parser produce roughly the same XML as wiki2xml? (same tag names, nesting etc)
No. But the new one is better! :-)
Is there a DTD, XML schema for the wikiXML? How about a rough spec?
No DTD or the like, but try the example at the end of this mail (can't attach files on the mailing list...)
Your help with the parser would be much appreciated.
Magnus
Example:
This is '''bold''' and ''italics'' and '''''both'''''.
List test
* dot
** two dots
# number
## two numbers
#* number, dot
Link test : [[solo link]], [[target|text]], [[image:test.jpg|thumb|100px|text]], [[target|]]
:Indent
{|
! a th-like element
|parameter| a cell
| another cell
|-
|parameter=something| another cell, another row
|}
== Heading 2 ==
=== Heading 3 ===
<nowiki>A nowiki text</nowiki>
XML: <article><paragraph>This is <bold>bold</bold> and <italics>italics</italics> and <italics><bold>both</bold></italics>.</paragraph><paragraph>List test</paragraph><list type='bullet'><listitem>dot<list type='bullet'><listitem>two dots</listitem></list></listitem></list><list type='numbered'><listitem>number<list type='numbered'><listitem>two numbers</listitem></list><list type='bullet'><listitem>number, dot</listitem></list></listitem></list><paragraph>Link test : <link><linktarget>solo link</linktarget></link>, <link><linktarget>target</linktarget><linkoption>text</linkoption></link>, <link><linktarget>image:test.jpg</linktarget> <linkoption>thumb</linkoption> <linkoption>100px</linkoption> <linkoption>text</linkoption></link>, <link emptypipeatend='yes'><linktarget>target</linktarget></link></paragraph><list type='indent'><listitem>Indent</listitem></list><table><tablerow><tablehead>a th-like element</tablehead><tablecell><attrs><attr name='parameter' isnull='yes'></attr></attrs> a cell</tablecell><tablecell>another cell</tablecell></tablerow><tablerow><tablecell><attrs><attr name='parameter'>something</attr></attrs> another cell, another row</tablecell></tablerow></table><heading level='2'> Heading 2 </heading><heading level='3'> Heading 3 </heading><paragraph><extension name='nowiki'>A nowiki text</extension></paragraph></article>
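Since there is no DTD, the example output above is the only description of the format. As a rough sketch of how a client might consume it, the following Python snippet pulls the links out of the "Link test" paragraph with the standard library's `xml.etree.ElementTree`. The element and attribute names (`link`, `linktarget`, `linkoption`, `emptypipeatend`) are taken from the example; the function name `extract_links` and the shape of the returned dictionaries are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Fragment of the example output above (the "Link test" paragraph only).
SAMPLE = (
    "<article><paragraph>Link test : "
    "<link><linktarget>solo link</linktarget></link>, "
    "<link><linktarget>target</linktarget><linkoption>text</linkoption></link>, "
    "<link><linktarget>image:test.jpg</linktarget> <linkoption>thumb</linkoption> "
    "<linkoption>100px</linkoption> <linkoption>text</linkoption></link>, "
    "<link emptypipeatend='yes'><linktarget>target</linktarget></link>"
    "</paragraph></article>"
)


def extract_links(xml_text):
    """Return one dict per <link> element: target, options, empty-pipe flag."""
    root = ET.fromstring(xml_text)
    links = []
    for link in root.iter("link"):
        links.append({
            "target": link.findtext("linktarget"),
            "options": [opt.text for opt in link.findall("linkoption")],
            # [[target|]] is marked with emptypipeatend='yes' in the example.
            "empty_pipe": link.get("emptypipeatend") == "yes",
        })
    return links


if __name__ == "__main__":
    for entry in extract_links(SAMPLE):
        print(entry)
```

Note that the parser does not interpret the link options: `thumb`, `100px` and the caption all arrive as plain `linkoption` elements, so telling them apart is left to the client.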