TL;DR: Many older parser bugs were deferred to the new parser. However, several bugs, mostly newer ones, were assigned to be worked on now. To contribute to the new call for bugs, read http://hexm.de/20.
This past Monday, I held a bug triage meeting with WMF developers to cover parser bugs that I and other participants on this list had marked for triage. There were 27 bugs in total. Because the call for bugs was open-ended, we ended up with a fair number of “oldie-but-goodie” bugs. With Brion's parser rewrite currently being planned, many of these were easy to shrug off. The rationale: if we've lived with the pain so far, why go through the horrible pain of fixing it in the current parser when a more maintainable parser is just around the corner?
Still, it was helpful to look at these older bugs to see the sort of problems that the parser rewrite needs to address. We decided the following bugs would be better addressed in the new parser:
Preceding text and single apostrophes are not included in links http://bugzilla.wikimedia.org/468
Incorrect parsing of table headings and cells on the same line http://bugzilla.wikimedia.org/549
[[#foo|]], [[/bar|]] should be equivalent to [[#foo|foo]], [[/bar|bar]] (new use of "pipe trick") http://bugzilla.wikimedia.org/845
Newline as list item terminator is troublesome http://bugzilla.wikimedia.org/1115
pre over multiple lines in lists http://bugzilla.wikimedia.org/1581
Need method for multiparagraph list items, continuing numbered lists, and assigning specific numbers to list items http://bugzilla.wikimedia.org/1584
Allow one blank line in list environments http://bugzilla.wikimedia.org/9342
Automatic nbsp is inserted even into XHTML attributes, including style http://bugzilla.wikimedia.org/3158
The newline added to a template, magic word, variable, or parser function that returns line-start wikicode formatting (*#:;) causes unexpected parsing http://bugzilla.wikimedia.org/12974
Leading spaces in <pre> block render incorrectly when block preceded by another <pre> http://bugzilla.wikimedia.org/3230
Blank lines at the top of an article should be ignored http://bugzilla.wikimedia.org/4161
Single newlines sometimes create paragraphs http://bugzilla.wikimedia.org/9207
Block element written inline splits multiline paragraphs http://bugzilla.wikimedia.org/5718
Linebreaks are mishandled in <blockquote> and <li> http://bugzilla.wikimedia.org/6200
Multiline tags in lists should be output more intelligently http://bugzilla.wikimedia.org/9996
Bold/italic markup handled differently depending on leading whitespace http://bugzilla.wikimedia.org/18765
post expand size counted multiple times for nested transclusions http://bugzilla.wikimedia.org/13260
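To make the pipe-trick request in bug 845 concrete, here is a rough sketch of the expansion it asks for. This is not MediaWiki code: the function name, the regular expression, and the choice to handle only the "#" and "/" prefixes are my own assumptions for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from MediaWiki): a link whose
# target starts with "#" or "/" and whose label after the pipe is empty.
_EMPTY_PIPE = re.compile(r"\[\[([#/])([^\[\]|]+)\|\]\]")


def expand_pipe_trick(wikitext: str) -> str:
    """Fill an empty label with the target minus its leading '#' or '/'."""
    # Group 1 is the prefix character, group 2 is the rest of the target.
    return _EMPTY_PIPE.sub(r"[[\1\2|\2]]", wikitext)


if __name__ == "__main__":
    print(expand_pipe_trick("See [[#foo|]] and [[/bar|]]."))
```

Links that already carry a label, and ordinary page links, pass through unchanged in this sketch.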
Additionally, Brion punted this bug to the new rich text editor he has planned since the problem is seen mostly in copy-and-pasted URLs:
External URL syntax cannot handle square brackets http://bugzilla.wikimedia.org/3695
One bug was dismissed as WONTFIX. The reporter had a certain behavior in mind when he wrote ''''lots of quotes'''', and while the parser acted consistently, it didn't act in his preferred manner:
Single quote inside triple quote bold (''') parsing error http://bugzilla.wikimedia.org/13227
Still, all was not lost. For example, Neil saw this ancient bug as an opportunity to get closer to the gnarly internals of MediaWiki:
tilde signatures inside nowiki tags sometimes get expanded (<includeonly><nowiki>~~~~</nowiki></includeonly>) http://bugzilla.wikimedia.org/93
Sam saw this bug and decided it looked like it would be easy to test and apply the included patch:
Transcluded special pages expose strip markers when they output parsed messages http://bugzilla.wikimedia.org/16129
Finally, Tim saw these two relatively recent bugs and decided he would investigate them further and hopefully fix them:
DOM preprocessor barfs on headings inserted by parser functions http://bugzilla.wikimedia.org/21844
{{fullurl:}} does not urlencode passed querystring http://bugzilla.wikimedia.org/27972
To see the notes from the bug triage (thanks, Sumana!), visit http://etherpad.wikimedia.org/BugTriage.
Please see my earlier email to the list (http://hexm.de/20) if you'd like to contribute to this coming week's triage.
Mark.
wikitech-l@lists.wikimedia.org