On Thu, Jul 25, 2002 at 02:25:18AM -0700, lcrocker@nupedia.com wrote:
I'd love to have a formal grammar of some kind (I think regexps would be fine),
Hmm, I seem to remember I promised that once. :-/ I'll see what I can do. If people want to help, just go to
http://www.wikipedia.com/wiki/User:Jan_Hidders/Wikipedia_syntax
(I probably should put this on the meta-wikipedia.)
Just to be clear: the syntax should not describe what we do and do not accept (we actually accept everything, so that's a really simple grammar :-)) but should have enough "resolution" to allow us to specify the semantics of the mark-up. At first we should not concentrate on making it LALR(1) or anything, but just on making it unambiguous (in the parsing sense of the word) and complete.
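To illustrate what "complete" and "unambiguous" could mean in practice, here is a minimal sketch in Python. It is not the actual Wikipedia grammar: the three constructs it knows ('''bold''', ''italic'' and [[link]]) and the names TOKEN_RE and tokenize are assumptions made up for this example. Anything that matches no rule falls through as plain text, so every input is accepted, and the fixed order of the alternatives gives each input exactly one tokenization.

```python
import re

# Hypothetical token rules, for illustration only (not the real wiki syntax).
# ''' is listed before '' so that three quotes are never read as two plus one.
# The final catch-all alternative makes the grammar "complete": no input is rejected.
TOKEN_RE = re.compile(
    r"(?P<bold>''')"
    r"|(?P<italic>'')"
    r"|(?P<link>\[\[[^\[\]|]+\]\])"
    r"|(?P<text>.)",
    re.DOTALL,
)


def tokenize(source):
    """Split source into (kind, value) pairs, merging adjacent plain text."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, value = match.lastgroup, match.group()
        if kind == "text" and tokens and tokens[-1][0] == "text":
            # Glue single characters back together into one text token.
            tokens[-1] = ("text", tokens[-1][1] + value)
        else:
            tokens.append((kind, value))
    return tokens


if __name__ == "__main__":
    for token in tokenize("a ''b'' [[C]]"):
        print(token)
```

Because the token values always concatenate back to the original input, even broken mark-up such as an unclosed [[ is accepted; it simply comes out as plain text, and the semantics can then be specified per token kind.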
and I agree with Jan that a totally wiki-specific syntax would be far better than our current mish-mash of HTML and wiki markup. But I'm not sure if it's not already too late to revisit those decisions.
Was it a conscious decision? I got the impression that the early software didn't filter out HTML, so people used it, and now we are stuck with it.
Apart from the big technical advantages, I still feel that having a simple HTML-free mark-up language is necessary to keep Wikipedia accessible for newcomers. Having lots of complicated HTML that is not very WYSIWYG makes editing harder. This inevitably means that you cannot do a lot of fancy lay-out things, but I believe that is not a bug but a feature.
So, yes, it is probably impossible to come up with an HTML-free mark-up that has an equivalent for all the HTML that is currently used. However, we would probably be breaking only a very small percentage of pages and we could even automatically detect those pages and put them on a "to be simplified" list.
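The automatic detection could be sketched roughly as follows. This is only an illustration under stated assumptions: the pages are taken to be available as a plain dictionary of title to text, anything shaped like <tag ...> or </tag> is treated as HTML, and the set of tags that would have a wiki equivalent (CONVERTIBLE) is invented for the example, as are all the function names.

```python
import re

# Assumption: anything that looks like <tag ...> or </tag> counts as HTML.
HTML_TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)\b[^<>]*>")

# Hypothetical set of tags that an HTML-free mark-up could express directly.
CONVERTIBLE = frozenset({"b", "i", "br", "p"})


def html_tags_used(wikitext):
    """Return the sorted, lower-cased names of all HTML-like tags in the text."""
    return sorted({m.group(1).lower() for m in HTML_TAG_RE.finditer(wikitext)})


def to_be_simplified(pages, convertible=CONVERTIBLE):
    """Map each page title to the tags it uses that have no wiki equivalent.

    Pages that use no HTML, or only convertible HTML, are left out.
    """
    flagged = {}
    for title, text in pages.items():
        leftover = [tag for tag in html_tags_used(text) if tag not in convertible]
        if leftover:
            flagged[title] = leftover
    return flagged


if __name__ == "__main__":
    pages = {
        "A": "plain ''text''",
        "B": "<b>bold</b><br>",
        "C": "<TABLE border=1><tr><td>x</td></tr></table>",
    }
    print(to_be_simplified(pages))
```

Only the pages that end up in the result would need manual attention; the rest could in principle be converted mechanically.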
But if it isn't, I'll be happy to discuss what a syntax might look like.
I once made a proposal at
http://www.wikipedia.com/wiki/User:Jan_Hidders/HTML-free_mark-up
but I have to admit that it was mainly to draw some discussion.
-- Jan Hidders