On 11/9/05, Magnus Manske <magnus.manske@web.de> wrote:
Timwi wrote:
My first (two-character) test input:
{|
yields invalid XML as output. ;-) In general, it does so whenever the close-table markup (|}) is missing.
I hacked in a fix for nested tables minutes before announcing here, so it's probably a side effect of that. I'll have a look, thanks for noticing.
Also, you seem to be ignoring all whitespace at the beginning of the input, which makes it output a <paragraph> when the first line should have been a <pre> because it starts with a space.
Yep. Already fixed by changing "trim" to "rtrim" :-)
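For anyone following along, here is a rough sketch of why that one-word change matters. It is written in Python, where `str.strip` and `str.rstrip` stand in for `trim` and `rtrim`; the `classify_first_line` helper is purely hypothetical and is not taken from the converter's code.

```python
def classify_first_line(wikitext: str, strip_leading: bool) -> str:
    """Return 'pre' if the first line starts with a space, else 'paragraph'.

    Hypothetical helper for illustration only.
    """
    # strip() removes whitespace on both ends (analogous to trim);
    # rstrip() removes only trailing whitespace (analogous to rtrim).
    cleaned = wikitext.strip() if strip_leading else wikitext.rstrip()
    first_line = cleaned.split("\n", 1)[0]
    return "pre" if first_line.startswith(" ") else "paragraph"
```

With `strip_leading=True` the leading space is gone before the line is inspected, so a preformatted line is misread as a paragraph; with `strip_leading=False` the space survives and the line is recognised as preformatted.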
Otherwise: Very impressive!!
Thanks! I hope with added OpenDocument export, this will become useful one day.
It's useful already... The complexity of the wikitext syntax (from a programmer's perspective) is quite high, and this adds a substantial level of friction to creating tools that look for content in pages. Even something as simple as extracting all the text of an article while excluding content in images can be a pain. The XML representation is much easier to work with.
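To illustrate the point, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The element names in the sample (`article`, `image`) and the `text_without` helper are assumptions made for the example; only `paragraph` is known from this thread, and the converter's real output may be structured differently.

```python
import xml.etree.ElementTree as ET


def text_without(root, skip_tags):
    """Collect all text under root, ignoring subtrees whose tag is in skip_tags."""
    parts = []

    def walk(node):
        if node.tag in skip_tags:
            return  # drop the element and everything inside it
        if node.text:
            parts.append(node.text)
        for child in node:
            walk(child)
            # Tail text follows the child but belongs to the parent,
            # so it is kept even when the child itself was skipped.
            if child.tail:
                parts.append(child.tail)

    walk(root)
    return "".join(parts)


# Assumed sample document; the tag names are hypothetical.
sample = (
    "<article><paragraph>Cats are "
    "<image>Cat.jpg|thumb|A cat</image>"
    "small mammals.</paragraph></article>"
)
plain = text_without(ET.fromstring(sample), {"image"})
```

Doing the same against raw wikitext would mean parsing nested brackets, pipes, and templates by hand, which is where most of the friction comes from.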