On Thu, Apr 23, 2009 at 1:40 PM, David Gerard dgerard@gmail.com wrote:
[cc'ed to mediawiki-l] [from wikien-l - discussion of git-backed MediaWiki]
2009/4/23 Gwern Branwen gwern0@gmail.com:
As it happens, I've thought about this before and have a little expertise in the issue. I'm one of the developers of a wiki called Gitit - http://github.com/jgm/gitit/tree/master - written in Haskell. The most interesting thing about Gitit, besides its ability to export articles (written in Markdown or ReST) to various formats such as HTML, PDF, or LaTeX, is that it uses a library called 'filestore' - http://hackage.haskell.org/cgi-bin/hackage-scripts/package/filestore - to access and change articles.
While the idea of putting lumps of PHP into an otherwise Haskell project is really quite horrifying, the parser that (literally) defines MediaWiki wikitext could to some degree be made into a module for use elsewhere. I understand it's not entirely cleanly separated out in the MediaWiki codebase, but if you could do that you'd at least have something that quite definitely processed MediaWiki wikitext precisely as MediaWiki does.
This might be better for mediawiki-l ...
- d.
The library/executable actually doing all the translating, Pandoc, has a relatively limited MediaWiki capability - it's output only, as I said, not input. It's much easier to take your intermediate representation and turn it into a small subset of correct MediaWiki markup (omitting such things as templates, which don't exist in Markdown & ReST) than it is to parse the wild-and-woolly world of MediaWiki files (as many projects have discovered to their dismay).
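
To make the "output is easy" half of that concrete, here is a rough sketch in Python of a writer that walks a tiny intermediate representation and emits a small subset of MediaWiki markup. This is not Pandoc's code or API; the node types (Text, Emph, Strong, Link, Header, Para, BulletList) and the function write_mediawiki are hypothetical names made up for illustration, and the subset deliberately leaves out templates, tables, and everything else that makes the reverse direction hard.

```python
from dataclasses import dataclass
from typing import List, Union


# --- Hypothetical intermediate representation (not Pandoc's real types) ---

@dataclass
class Text:
    value: str

@dataclass
class Emph:
    children: List["Inline"]

@dataclass
class Strong:
    children: List["Inline"]

@dataclass
class Link:
    label: str
    url: str

Inline = Union[Text, Emph, Strong, Link]

@dataclass
class Header:
    level: int
    children: List[Inline]

@dataclass
class Para:
    children: List[Inline]

@dataclass
class BulletList:
    items: List[List[Inline]]

Block = Union[Header, Para, BulletList]


def render_inlines(inlines: List[Inline]) -> str:
    """Render a run of inline nodes as MediaWiki markup."""
    out = []
    for node in inlines:
        if isinstance(node, Text):
            out.append(node.value)
        elif isinstance(node, Emph):
            out.append("''" + render_inlines(node.children) + "''")
        elif isinstance(node, Strong):
            out.append("'''" + render_inlines(node.children) + "'''")
        elif isinstance(node, Link):
            # External-link form: [url label]
            out.append(f"[{node.url} {node.label}]")
        else:
            raise TypeError(f"unsupported inline node: {node!r}")
    return "".join(out)


def render_block(block: Block) -> str:
    """Render one block-level node as MediaWiki markup."""
    if isinstance(block, Header):
        bars = "=" * block.level
        return f"{bars} {render_inlines(block.children)} {bars}"
    if isinstance(block, Para):
        return render_inlines(block.children)
    if isinstance(block, BulletList):
        return "\n".join("* " + render_inlines(item) for item in block.items)
    raise TypeError(f"unsupported block node: {block!r}")


def write_mediawiki(blocks: List[Block]) -> str:
    """Join rendered blocks with blank lines, as MediaWiki expects."""
    return "\n\n".join(render_block(b) for b in blocks)
```

Each node maps to exactly one piece of markup, so the writer is a plain recursive walk with no ambiguity to resolve. A reader going the other way would have to cope with templates, nested quote runs, and inline HTML, which is where the difficulty lies.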