The main goal is, of course, to be able to convert wikitext to as many output formats as possible. That is why I'll first try to write a parser function that returns a document tree, which I can then process further to produce each output format.
So I sat down for half an hour and tried to identify each and every entity wikitext might consist of.
The results are available here: http://docs.linux.org.ua/~jaroslaw/?p=8
If anyone is interested in this, I encourage you to comment on this table, especially if I've got something wrong or the table is incomplete.
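To make the "parse once, render many times" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Node` class, the node kinds, and the `parse`, `to_html` and `to_plain` functions are hypothetical names, not part of MediaWiki or of the entity table linked above, and only headings, paragraphs, bold and italic are handled.

```python
import html
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the (hypothetical) document tree."""
    kind: str                 # "document", "heading", "paragraph", "bold", "italic", "text"
    text: str = ""
    level: int = 0            # heading level, unused for other kinds
    children: list = field(default_factory=list)


HEADING = re.compile(r"^(={1,6})\s*(.+?)\s*\1$")
INLINE = re.compile(r"'''(.+?)'''|''(.+?)''")


def parse_inline(text):
    """Split a line into text, bold and italic nodes."""
    nodes, pos = [], 0
    for m in INLINE.finditer(text):
        if m.start() > pos:
            nodes.append(Node("text", text[pos:m.start()]))
        if m.group(1) is not None:
            nodes.append(Node("bold", m.group(1)))
        else:
            nodes.append(Node("italic", m.group(2)))
        pos = m.end()
    if pos < len(text):
        nodes.append(Node("text", text[pos:]))
    return nodes


def parse(wikitext):
    """Build a document tree from a small subset of wikitext."""
    doc = Node("document")
    para = []

    def flush():
        # Close the paragraph collected so far, if any.
        if para:
            doc.children.append(Node("paragraph", children=parse_inline(" ".join(para))))
            para.clear()

    for line in wikitext.splitlines():
        line = line.strip()
        m = HEADING.match(line)
        if m:
            flush()
            doc.children.append(Node("heading", m.group(2), level=len(m.group(1))))
        elif line:
            para.append(line)
        else:
            flush()
    flush()
    return doc


def to_html(node):
    """First output format: HTML."""
    if node.kind == "document":
        return "\n".join(to_html(c) for c in node.children)
    if node.kind == "heading":
        return f"<h{node.level}>{html.escape(node.text)}</h{node.level}>"
    if node.kind == "paragraph":
        return "<p>" + "".join(to_html(c) for c in node.children) + "</p>"
    if node.kind == "bold":
        return f"<b>{html.escape(node.text)}</b>"
    if node.kind == "italic":
        return f"<i>{html.escape(node.text)}</i>"
    return html.escape(node.text)


def to_plain(node):
    """Second output format: plain text, produced from the same tree."""
    if node.kind == "document":
        return "\n\n".join(to_plain(c) for c in node.children)
    if node.kind == "paragraph":
        return "".join(to_plain(c) for c in node.children)
    return node.text
```

Calling `parse()` once and passing the result to both `to_html()` and `to_plain()` shows the point of the tree: adding another output format means writing another small renderer, not another parser.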
On Saturday, 17.12.2005, at 17:33 +0200, Yaroslav Fedevych wrote:
For a wiki-to-XML converter that works reasonably well (a little slow though, especially with tables), see CVS, "wiki2xml" module, "php" directory. Or:
http://cvs.sourceforge.net/viewcvs.py/wikipedia/wiki2xml/php/
It works for wikitext as a source, or a list of page titles. It can even replace templates "live".
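As a rough illustration of what "wikitext in, XML out, with templates replaced" can mean, here is a small Python sketch. It does not reproduce wiki2xml (which is written in PHP) or its output schema: the `article` and `paragraph` element names, the `expand_templates` and `to_xml` functions, and the in-memory `templates` dictionary are all assumptions made for the example, and only parameterless `{{Name}}` templates are handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches a parameterless template call such as {{Name}}.
TEMPLATE = re.compile(r"\{\{([^{}|]+)\}\}")


def expand_templates(wikitext, templates, max_depth=5):
    """Replace {{Name}} with templates[Name]; unknown templates are left as-is.

    max_depth bounds repeated passes so that templates which include
    each other cannot loop forever.
    """
    for _ in range(max_depth):
        expanded = TEMPLATE.sub(
            lambda m: templates.get(m.group(1).strip(), m.group(0)), wikitext
        )
        if expanded == wikitext:
            break
        wikitext = expanded
    return wikitext


def to_xml(title, wikitext, templates):
    """Expand templates, then emit one <paragraph> per blank-line-separated block."""
    article = ET.Element("article", {"title": title})
    text = expand_templates(wikitext, templates)
    for block in re.split(r"\n\s*\n", text.strip()):
        ET.SubElement(article, "paragraph").text = " ".join(block.split())
    return ET.tostring(article, encoding="unicode")
```

For example, `to_xml("Test", "Hello {{who}}!", {"who": "world"})` produces a single `article` element containing one `paragraph`. A real converter would fetch template text from the wiki instead of a dictionary, which is presumably where much of the cost lies.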
I've started to add an OpenDocument text output. Any help would be appreciated.
Magnus
On 17/12/05, Magnus Manske <magnus.manske@web.de> wrote:
And for other attempts (just for interest's sake; Magnus's is probably the most worthwhile), see http://meta.wikimedia.org/wiki/Alternative_parsers
-- Rowan Collins BSc [IMSoP]
On 17/12/05, Magnus Manske <magnus.manske@web.de> wrote:
Hey Magnus,
I am definitely interested in helping out. Have you abandoned the C++ code, and are you currently working only on the PHP code?
Is there a better place to discuss wiki2xml, or should it stay on [Mediawiki-l]?
Cheers,
Michael
mediawiki-l@lists.wikimedia.org