First, I notice there are some very useful links here: http://meta.wikimedia.org/wiki/Alternative_parsers
There seems to be vague consensus that:
- It would be a good idea to replace the parser with a simpler parser using a more traditional method
- Some parts of the grammar are virtually impossible to implement in such a parser
- We may have to modify those bits of the grammar
- We will have to take very careful steps to roll out the new parser, to avoid community outcry and breaking existing wikitext
- We would like to know which bits of the grammar are likely to be affected and how important they are
Can I suggest that as a first step, we produce a table of the form:
Language feature | Difficulty of implementation | Changes required | Impact of changes | <link to EBNF>
where the rows are sorted in the order they are processed by the current parser.
So for example,
Nowiki | Easy | None | - | ...
Nested lists | Hard (I think) | To be determined | To be determined | ...
etc.
The main insight to come out of this is that we will see *all* the major problems, not just the ones that keep being raised, like bold/italics.
Steve
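The proposed table could be kept as structured data and rendered in parser order. The following is only a minimal sketch: the `FeatureRow` type, the `parser_order` field and the order values given to the two example rows are placeholders invented for illustration, not facts about the current parser.

```python
from dataclasses import dataclass


@dataclass
class FeatureRow:
    feature: str
    difficulty: str
    changes_required: str
    impact: str
    ebnf_link: str
    # Hypothetical field: position of the feature in the current
    # parser's processing order (values below are placeholders).
    parser_order: int


HEADER = ("Language feature | Difficulty of implementation | "
          "Changes required | Impact of changes | <link to EBNF>")


def render_table(rows):
    """Render rows as pipe-separated lines, sorted by parser order."""
    lines = [HEADER]
    for r in sorted(rows, key=lambda r: r.parser_order):
        lines.append(" | ".join(
            [r.feature, r.difficulty, r.changes_required, r.impact, r.ebnf_link]))
    return "\n".join(lines)


rows = [
    FeatureRow("Nested lists", "Hard (I think)", "To be determined",
               "To be determined", "...", parser_order=2),
    FeatureRow("Nowiki", "Easy", "None", "-", "...", parser_order=1),
]

print(render_table(rows))
```

Keeping the sort key as a separate field means the same rows can be re-sorted by another column without retyping the table.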
On Mon, Nov 12, 2007 at 10:30:08AM +1100, Steve Bennett wrote:
First, I notice there are some very useful links here: http://meta.wikimedia.org/wiki/Alternative_parsers
There seems to be vague consensus that:
- It would be a good idea to replace the parser with a simpler parser using
a more traditional method
- Some parts of the grammar are virtually impossible to implement in such a
parser
- We may have to modify those bits of the grammar
- We will have to take very careful steps to roll out the new parser, to
avoid community outcry and breaking existing wikitext
- We would like to know which bits of the grammar are likely to be affected
and how important they are
Can I suggest that as a first step, we produce a table of the form:
Language feature | Difficulty of implementation | Changes required | Impact of changes | <link to EBNF>
where the rows are sorted in the order they are processed by the current parser.
So for example,
Nowiki | Easy | None | - | ...
Nested lists | Hard (I think) | To be determined | To be determined | ...
I would make two suggestions: move Impact before Changes, and sort by *it*. I continue to think that that metric will be the killer here.
Cheers, -- jra
On 11/12/07, Jay R. Ashworth jra@baylink.com wrote:
I would make two suggestions: move Impact before Changes, and sort by *it*. I continue to think that that metric will be the killer here.
I guess. It's very subjective though. IMHO, a parser that simply refuses to handle any complicated combinations of ''' and '' (that is, it never attempts to render a ' literally if it is touching a combination of '', ''' or ''''') would have a very minor impact if we provided an alternative syntax. Even if it refused to allow constructions like '''''foo''' blah''. But there might be an outcry.
Steve
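To make the "simply refuses" idea above concrete, here is a rough sketch of one possible strict rule set. It is not the behaviour of the current parser, and the function name `render_strict`, the error class and the exact rules (only runs of 2, 3 or 5 apostrophes are markup, and markup must close in strict nesting order) are assumptions chosen for illustration.

```python
import re
from html import escape

# Any run of two or more apostrophes is treated as a markup token.
RUN = re.compile(r"'{2,}")

# Assumed mapping: run length -> tags it toggles (outermost first).
TAGS = {2: ["i"], 3: ["b"], 5: ["b", "i"]}


class StrictQuoteError(ValueError):
    """Raised when the line uses a construction this sketch refuses."""


def render_strict(line):
    out = []
    stack = []  # currently open tags, innermost last
    pos = 0
    for m in RUN.finditer(line):
        out.append(escape(line[pos:m.start()], quote=False))
        pos = m.end()
        tags = TAGS.get(len(m.group()))
        if tags is None:
            # e.g. '''' or '''''': never guess at a literal apostrophe
            raise StrictQuoteError(f"unsupported run of {len(m.group())} apostrophes")
        if any(t in stack for t in tags):
            # Closing: the tags must be exactly the innermost open ones.
            k = len(tags)
            if sorted(stack[-k:]) != sorted(tags):
                raise StrictQuoteError("misnested bold/italic markup")
            for _ in range(k):
                out.append(f"</{stack.pop()}>")
        else:
            for t in tags:
                out.append(f"<{t}>")
                stack.append(t)
    out.append(escape(line[pos:], quote=False))
    if stack:
        raise StrictQuoteError("unclosed bold/italic markup")
    return "".join(out)
```

Under these assumed rules `'''a ''b'' c'''` is accepted, while `'''''foo''' blah''` is rejected as misnested. A lone apostrophe, as in "it's", is never touched because only runs of two or more count as tokens.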
On Mon, Nov 12, 2007 at 09:07:33PM +1100, Steve Bennett wrote:
But there might be an outcry.
Certainly. And that's what I'm trying to head off at the parse[0], by suggesting that we need to gather stats.
Cheers, -- jra
[0] Oooof.
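One way to gather such stats, sketched here under assumptions: count how often apostrophe runs of each length occur in a sample of wikitext, and treat any length other than 2, 3 or 5 as "unusual". The function names, the choice of "ordinary" lengths and the three sample strings are illustrative only; real numbers would have to come from an actual dump.

```python
import re
from collections import Counter

RUN = re.compile(r"'{2,}")


def apostrophe_run_stats(pages):
    """Count apostrophe runs by length across an iterable of wikitext strings."""
    counts = Counter()
    for text in pages:
        for m in RUN.finditer(text):
            counts[len(m.group())] += 1
    return counts


def share_of_unusual_runs(counts, ordinary=(2, 3, 5)):
    """Fraction of runs whose length is not one of the ordinary markup lengths."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    unusual = sum(c for n, c in counts.items() if n not in ordinary)
    return unusual / total


# Made-up sample pages, for illustration only.
sample = [
    "''italic'' and '''bold'''",
    "L''''arc'''' is an odd case",
    "plain text, it's fine",
]
stats = apostrophe_run_stats(sample)
print(dict(stats), share_of_unusual_runs(stats))
```

Run-length counts are a crude proxy: they do not detect misnested but individually ordinary runs, which would need a per-line check on top.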
If you want to get another parser accepted you'd do well to start by having the implementations run the current test suite. The difficulty of implementing each parser feature depends largely on how your parser is written, and in any case implementing things is more important than speculating about their supposed difficulty.
The test suite is also something that people will be forced to update, so cataloging parser features there is much better than keeping them in some document that may or may not be updated in the future.
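As a rough illustration of cataloging features through tests, the sketch below runs a candidate parser against cases tagged with a feature name and reports pass counts per feature. The case tuples, the `run_suite` helper and the deliberately incomplete `toy_parse` are all hypothetical; the real test suite has its own file format, which is not reproduced here.

```python
import re
from collections import defaultdict


def run_suite(parse, cases):
    """Run (feature, wikitext, expected_html) cases; return {feature: (passed, total)}."""
    results = defaultdict(lambda: [0, 0])
    for feature, wikitext, expected in cases:
        results[feature][1] += 1
        try:
            ok = parse(wikitext) == expected
        except Exception:
            ok = False  # a crash counts as a failure for that feature
        if ok:
            results[feature][0] += 1
    return {feature: tuple(v) for feature, v in results.items()}


# Hypothetical cases, not taken from the real test suite.
CASES = [
    ("plain text", "hello", "<p>hello</p>"),
    ("italics", "''hi''", "<p><i>hi</i></p>"),
    ("bold", "'''hi'''", "<p><b>hi</b></p>"),
]

# Matches exactly two apostrophes that are not part of a longer run.
ITALIC = re.compile(r"(?<!')''(?!')(.+?)(?<!')''(?!')")


def toy_parse(wikitext):
    """Deliberately incomplete candidate parser: plain text and italics only."""
    return "<p>" + ITALIC.sub(r"<i>\1</i>", wikitext) + "</p>"


report = run_suite(toy_parse, CASES)
for feature, (passed, total) in report.items():
    print(f"{feature}: {passed}/{total}")
```

A per-feature report of this kind gives the same overview as a hand-maintained table, but it is derived from tests that have to be kept current anyway.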
On 11/13/07, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
The test suite is also something that people will be forced to update, so cataloging parser features there is much better than keeping them in some document that may or may not be updated in the future.
Good idea.
Steve
wikitech-l@lists.wikimedia.org