On Saturday, 17.12.2005, at 17:33 +0200, Yaroslav Fedevych wrote:
The main goal is, of course, to be able to convert wikitext to as many output formats as possible. That is why I'll first try to write a parser function that returns a document tree, which I can then process further.
So I sat down for half an hour and tried to identify each and every entity wikitext might consist of.
The results are available here: http://docs.linux.org.ua/~jaroslaw/?p=8
If anyone is interested in this, I encourage you to comment on the table, especially if I've got something wrong or the table is incomplete.
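To make the "parser function that returns a document tree" idea concrete, here is a minimal sketch in Python. It is not the parser described above and covers only a handful of entities (headings, paragraphs, bold, italic, internal links); the `Node` class, the node kinds, and the function names are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                                   # "document", "heading", "paragraph", "bold", ...
    text: str = ""                              # literal text for leaf-like nodes
    attrs: dict = field(default_factory=dict)   # e.g. heading level, link target
    children: list = field(default_factory=list)

# Inline markup handled by this sketch only: '''bold''', ''italic'', [[target|label]].
INLINE = re.compile(
    r"'''(?P<bold>.+?)'''"
    r"|''(?P<italic>.+?)''"
    r"|\[\[(?P<target>[^\]|]+)(?:\|(?P<label>[^\]]+))?\]\]"
)
# A heading line such as "== Title ==" (1 to 6 equals signs on each side).
HEADING = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")

def parse_inline(text):
    """Split a run of text into text, bold, italic and link nodes."""
    nodes, pos = [], 0
    for m in INLINE.finditer(text):
        if m.start() > pos:
            nodes.append(Node("text", text[pos:m.start()]))
        if m.group("bold") is not None:
            nodes.append(Node("bold", children=parse_inline(m.group("bold"))))
        elif m.group("italic") is not None:
            nodes.append(Node("italic", children=parse_inline(m.group("italic"))))
        else:
            target = m.group("target")
            nodes.append(Node("link", m.group("label") or target, {"target": target}))
        pos = m.end()
    if pos < len(text):
        nodes.append(Node("text", text[pos:]))
    return nodes

def parse_wikitext(source):
    """Return a document tree: headings plus paragraphs built from adjacent lines."""
    doc = Node("document")
    buffer = []

    def flush():
        # Turn the collected lines into one paragraph node.
        if buffer:
            doc.children.append(Node("paragraph", children=parse_inline(" ".join(buffer))))
            buffer.clear()

    for line in source.splitlines():
        m = HEADING.match(line)
        if m:
            flush()
            doc.children.append(Node("heading", m.group(2), {"level": len(m.group(1))}))
        elif line.strip():
            buffer.append(line.strip())
        else:
            flush()          # a blank line ends the current paragraph
    flush()
    return doc
```

Calling `parse_wikitext("== Intro ==\nSome '''bold''' text.")` yields a `document` node whose children are a `heading` and a `paragraph`. Each output format would then only need its own walker over this tree; tables, lists, templates and nested markup edge cases are deliberately left out.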
For a wiki-to-XML converter that works reasonably well (though a little slowly, especially with tables), see the "php" directory of the "wiki2xml" module in CVS, or:
http://cvs.sourceforge.net/viewcvs.py/wikipedia/wiki2xml/php/
It accepts either raw wikitext or a list of page titles as input. It can even replace templates "live".
I've started to add OpenDocument text output. Any help would be appreciated.
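As a rough illustration of template replacement followed by XML output, here is a small Python sketch. It is not how wiki2xml works internally: the template bodies come from a plain dict rather than being fetched, the `article` and `paragraph` element names are made up for this example, and pipes inside links within template arguments are not handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches an innermost template call such as {{Name}} or {{Name|a|b}}.
TEMPLATE = re.compile(r"\{\{([^{}]+)\}\}")

def expand_templates(text, templates, max_passes=10):
    """Replace {{Name|arg1|...}} with bodies taken from `templates` (a dict)."""
    def substitute(match):
        name, *args = [part.strip() for part in match.group(1).split("|")]
        body = templates.get(name)
        if body is None:
            return match.group(0)              # unknown template: leave it untouched
        for index, value in enumerate(args, start=1):
            body = body.replace("{{{%d}}}" % index, value)   # positional parameters
        return body

    for _ in range(max_passes):                # bounded, so recursive templates cannot loop forever
        expanded = TEMPLATE.sub(substitute, text)
        if expanded == text:
            break
        text = expanded
    return text

def to_xml(title, wikitext, templates):
    """Expand templates, then wrap each blank-line-separated block in a <paragraph>."""
    article = ET.Element("article", {"title": title})
    expanded = expand_templates(wikitext, templates)
    for block in re.split(r"\n\s*\n", expanded.strip()):
        if block:
            ET.SubElement(article, "paragraph").text = " ".join(block.split())
    return ET.tostring(article, encoding="unicode")
```

For example, `to_xml("Demo", "Hello {{greet|world}}.", {"greet": "dear {{{1}}}"})` produces an `article` element containing one `paragraph` with the text "Hello dear world.".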
Magnus