Can we deconstruct the current parser's processing steps and build a set of rules that must be followed?

This strikes me as an area where this kind of strange mixed-style nesting is rare enough that we may even be able to introduce a little bit of reform without much ill effect on the general body of wikitext out there.

I think we need to get a dump of English Wikipedia and start using a simple PEG parser to scan through it looking for patterns and figuring out how often certain things are used - if ever.
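
As a rough illustration of that kind of scan (a sketch only: the file name and patterns below are placeholders, and a plain regex pass stands in for the PEG parser):

  # Rough frequency count of mixed definition-list markup in a dump.
  # Assumes a decompressed pages-articles XML dump on disk; scanning every
  # line (not just <text> content) is good enough for a first estimate.
  import re
  from collections import Counter

  PATTERNS = {
      'dd_on_dt_line': re.compile(r'^;[^:\n]*:'),   # ; bla : blub
      'nested_dt':     re.compile(r'^;;'),          # ;; bla
      'mixed_list_dt': re.compile(r'^\*;'),         # *; bla
  }

  counts = Counter()
  with open('enwiki-pages-articles.xml', encoding='utf-8') as dump:
      for line in dump:
          for name, pattern in PATTERNS.items():
              if pattern.match(line):
                  counts[name] += 1

  for name, count in counts.most_common():
      print(name, count)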

Ward Cunningham had a setup that could do this sort of thing on a complete en-wiki dump in like 10-15 minutes, and a fraction of the dump (still tens of thousands of articles in size) in under a minute. We supposedly have access to him and his mad science laboratory - now would be a good time to get that going.

- Trevor

On Thu, Nov 10, 2011 at 8:00 AM, Gabriel Wicke <wicke@wikidev.net> wrote:
Sumana,

the regular

> ; bla : blub

is actually not the issue. More problematic are, for example:

 ;; bla :: blub

 *; bla : blub

or even the simple

 ;; bla

Right now the behavior is quite inconsistent:
 http://www.mediawiki.org/wiki/User:GWicke/Definitionlists

The bug discussing this is
 https://bugzilla.wikimedia.org/show_bug.cgi?id=6569

Treating '; bla : blub' as a tightly-bound special-case construct seems
to me the simplest way to make this area more consistent while avoiding
very ugly syntax. This would mean that

 *; bla : blub

is treated as equivalent to
 *; bla
 *: blub

and
 ;; bla :: blub

is equivalent to
 ;; bla
 ;: :blub
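
For what it's worth, here is a minimal sketch of that splitting rule (a hypothetical helper, not actual parser code; the regex and whitespace handling are guesses, and the nested ';; bla :: blub' case, which needs the extra colon carried into ':blub', is not covered):

  import re

  # Leading run of list characters ending in ';' (the dt marker), then the rest.
  LIST_PREFIX = re.compile(r'^([*#:;]*;)\s*(.*)$')

  def normalize_definition_line(line):
      m = LIST_PREFIX.match(line)
      if not m:
          return [line]                    # not a definition-term line
      prefix, rest = m.groups()
      if ' : ' not in rest:
          return [line]                    # plain '; bla' stays as one line
      term, definition = rest.split(' : ', 1)
      dd_prefix = prefix[:-1] + ':'        # '*;' -> '*:', ';' -> ':'
      return [prefix + ' ' + term, dd_prefix + ' ' + definition]

  # normalize_definition_line('*; bla : blub') -> ['*; bla', '*: blub']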

What are your preferences on this? Are any of these cases commonly used
today?

Gabriel


_______________________________________________
Wikitext-l mailing list
Wikitext-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitext-l