As I understand it, the problem with TeX is that it is a fully general, Turing-complete programming language: input can define new macros and can even change how the lexer reads characters on the fly (by reassigning category codes), etc. We should accept only a subset of TeX with the basic symbol-rendering macros, and very little else.
Here's a syntax for a subset of TeX macros in a vaguely BNF-style notation, basically specifying anything that comes after a backslash.
http://www.csci.csusb.edu/dick/samples/comp.text.TeX.html
http://www.csci.csusb.edu/dick/samples/comp.text.TeX.Mathematical.html
Would limiting TeX to using only these macros make it
* safe?
* complete enough for our purposes?
If so, then we could write a TeX parser that would "sanitize" (and canonicalize, if necessary) any input TeX before letting the real TeX interpreter see it, probably using Bison and C.
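To make the idea concrete, here is a rough sketch of the whitelist check. It is in Python rather than Bison and C purely for brevity, and it only scans for control sequences instead of parsing a full grammar. The names ALLOWED_MACROS, ALLOWED_SYMBOLS, rejected_macros and is_safe are made up for illustration, and the whitelist contents are placeholders; a real list would be taken from the grammar pages above.

```python
import re

# Placeholder whitelist (an assumption for this sketch); the real one would
# be derived from the macro grammar linked above.
ALLOWED_MACROS = {"alpha", "beta", "frac", "sqrt", "sum", "int", "infty"}

# Single-character control symbols treated as harmless (also an assumption).
ALLOWED_SYMBOLS = {",", ";", "!", " ", "{", "}", "\\"}

# A control sequence is a backslash followed by a run of letters, or by one
# other character. The final "$" alternative catches a dangling backslash.
CONTROL_SEQ = re.compile(r"\\([A-Za-z]+|.|$)", re.DOTALL)


def rejected_macros(tex):
    """Return, in order, everything in `tex` that is not whitelisted."""
    bad = []
    for match in CONTROL_SEQ.finditer(tex):
        name = match.group(1)
        allowed = ALLOWED_MACROS if name.isalpha() else ALLOWED_SYMBOLS
        if name not in allowed:
            bad.append("\\" + name)
    # TeX reads ^^5c as a backslash, so the ^^ notation could smuggle a
    # control sequence past the scan above; refuse it outright.
    if "^^" in tex:
        bad.append("^^")
    return bad


def is_safe(tex):
    """True if every control sequence in `tex` is on the whitelist."""
    return not rejected_macros(tex)
```

With this, is_safe(r"\frac{\alpha}{\beta}") passes, while input containing \input or \catcode is refused before the real TeX interpreter ever sees it. Rejecting the whole input, rather than trying to strip the offending macros, seems like the safer default.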
Neil
wikipedia-l@lists.wikimedia.org