The syntax of image links with caption is seriously flawed, but I think that I have found a reasonable solution for handling them: parse them as "inline blocks".
To make an inline block out of an image link with a caption, we first give it its own block context in the lexer, which guarantees the nesting order of internal block elements. This means that the end token cannot appear in the wrong block context:
[[File:example.jpg|<table><td> this ]] is not an end token for the image link</table> but this ]] is
I have already discussed image links in the context of speculative execution in the lexer, which is used to guarantee that any opened image link will be followed by an image link closing token. The maximum nesting level for links is limited to 2 to avoid pathological speculations.
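The block-context rule can be illustrated with a small sketch. This is not the actual lexer: the token set is reduced to image link openers, "]]" and table tags, the function name find_image_link_ends is made up for the illustration, and neither the speculative execution nor the nesting limit is modelled.

```python
import re

# Hypothetical, heavily reduced token set: image link opener, "]]", table tags.
TOKEN = re.compile(r"\[\[File:[^|\]]*\||\]\]|</?table>")

def find_image_link_ends(text):
    """Return the offsets of every ']]' that really closes an image link."""
    contexts = []  # one stack of open block elements per open image link
    ends = []
    for m in TOKEN.finditer(text):
        tok = m.group()
        if tok.startswith("[[File:"):
            contexts.append([])  # the image link gets its own block context
        elif tok == "<table>":
            if contexts:
                contexts[-1].append("table")
        elif tok == "</table>":
            if contexts and contexts[-1] and contexts[-1][-1] == "table":
                contexts[-1].pop()
        elif tok == "]]":
            # Only an end token when no block element opened inside the
            # caption is still open; otherwise it is ordinary text.
            if contexts and not contexts[-1]:
                contexts.pop()
                ends.append(m.start())
    return ends
```

Applied to the example above, only the second "]]" is reported, since the first one sits inside the still-open table.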
In the parser, inline blocks may appear in inlined text lines. For the purpose of apostrophe parsing, however, they break the inlined text line in two. Since block elements may appear in the image caption, the caption cannot be part of the lookahead that is performed when scanning for apostrophes. This means that in this example:
text '' italic [[File:example.jpg| text ]] foo '' bar
the text "text '' italic" and the text " foo '' bar" are processed separately when it comes to apostrophe parsing and the result will be:
<p>text <i> italic</i><a ...><img ..></a>foo <i> bar </i></p>
This differs from the current parser, which produces:
<p>text <i> italic<a ...><img ..></a>foo </i> bar</p>
However, the behavior will be the same regardless of new lines in the caption:
text '' italic [[File:example.jpg| text
text ]] foo '' bar
still:
<p>text <i> italic</i><a ...><img ..></a>foo <i> bar </i></p>
The original parser has problems with this input:
<p>text <i> italic<a ...><img ..></a>foo bar </i></i></p>
(My guess is that it first renders the </i> inside of the alt attribute, which is cleaned up in the attribute sanitizing, and then it discovers that there is a missing </i> and adds that in.)
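A rough sketch of the proposed behavior, under several assumptions: only '' (italic) is handled, the image link is found with a naive regular expression instead of the lexer, and the rendered link is a fixed placeholder. The names italics and render_line are invented for the illustration. The point is only that each side of the inline block is scanned for apostrophes on its own.

```python
import re

# Naive stand-in for the lexer: an image link up to the first "]]".
# DOTALL lets the caption span several lines.
IMAGE_LINK = re.compile(r"\[\[File:.*?\]\]", re.DOTALL)

def italics(segment):
    """Toggle <i> on each '' and close a still-open <i> at the segment end."""
    parts = segment.split("''")
    out = []
    for i, part in enumerate(parts):
        if i > 0:
            out.append("<i>" if i % 2 == 1 else "</i>")
        out.append(part)
    if len(parts) % 2 == 0:  # odd number of '' means one is left open
        out.append("</i>")
    return "".join(out)

def render_line(line):
    """Parse apostrophes separately on each side of every inline block."""
    out, pos = [], 0
    for m in IMAGE_LINK.finditer(line):
        out.append(italics(line[pos:m.start()]))
        out.append("<a ...><img ..></a>")  # placeholder for the rendered link
        pos = m.end()
    out.append(italics(line[pos:]))
    return "<p>" + "".join(out) + "</p>"
```

Because the caption is never part of the apostrophe scan, the result is the same whether or not the caption contains a newline (whitespace handling in this sketch differs slightly from the output shown above).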
In the original parser, wikitext list elements cannot appear in image captions. It would, of course, be very easy to just disable the wikitext list tokens in the lexer to provide the same behavior, but this seems a bit inconsistent, as any other block element may appear in the caption. If we instead, in the parser, push the current list context onto a stack when entering an "inline block" and pop it when leaving, we can support lists inside the caption with the expected behavior in this case:
* list [[File:example.jpg|
* list item in image caption
]]
* continuing outer list
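The push/pop idea for the example above could look roughly like this. The class ListContext, its method names and the event strings are all invented for the sketch; the real parser would emit events to a listener rather than collect strings.

```python
class ListContext:
    """Tracks open wikitext lists, with one saved context per inline block."""

    def __init__(self):
        self.current = []  # markers of the currently open lists, e.g. ["*", "#"]
        self.saved = []    # list contexts of the enclosing inline blocks
        self.events = []   # what a listener would be told (illustrative)

    def list_item(self, markers):
        # Keep the lists whose markers match, close the rest, open new ones.
        common = 0
        while (common < min(len(markers), len(self.current))
               and markers[common] == self.current[common]):
            common += 1
        while len(self.current) > common:
            self.events.append("close " + self.current.pop())
        for marker in markers[common:]:
            self.current.append(marker)
            self.events.append("open " + marker)
        self.events.append("item")

    def begin_inline_block(self):
        # Entering the caption: set the outer list context aside.
        self.saved.append(self.current)
        self.current = []
        self.events.append("begin caption")

    def end_inline_block(self):
        # Leaving the caption: close lists opened inside it, restore the outer ones.
        while self.current:
            self.events.append("close " + self.current.pop())
        self.current = self.saved.pop()
        self.events.append("end caption")
```

Feeding it the example (list item, caption start, list item, caption end, list item) opens a separate list inside the caption, closes it at the "]]", and then adds the last item to the outer list without reopening it.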
It is up to the listener to decide what to do with the link caption. Since the caption is fully parsed, the listening application must be prepared to receive markup events from inside it. In HTML output, the caption is rendered inside the 'alt' attribute, unless there is a 'frame' or 'thumb' option and no explicit 'alt' option (in which case the caption is completely ignored). So the listener should have the ability to toggle rendering of markup on and off in order to render the caption inside the alt attribute.
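As an illustration of such a toggle, here is a minimal listener sketch. The class HtmlListener and its event methods are assumptions, not the actual listener interface; it handles only italics and a single, non-nested image link, and always uses the caption as alt text (the 'frame'/'thumb' case is left out).

```python
from html import escape

class HtmlListener:
    """Hypothetical listener that can switch markup rendering off."""

    def __init__(self):
        self.out = []
        self.markup_enabled = True
        self.caption = None  # collects caption text while inside an image link
        self.src = None

    def text(self, s):
        if self.caption is not None:
            self.caption.append(s)  # escaped later, as an attribute value
        else:
            self.out.append(escape(s))

    def begin_italic(self):
        if self.markup_enabled:
            self.out.append("<i>")

    def end_italic(self):
        if self.markup_enabled:
            self.out.append("</i>")

    def begin_image_link(self, src):
        self.src = src
        self.caption = []
        self.markup_enabled = False  # the caption goes into alt: no tags

    def end_image_link(self):
        alt = "".join(self.caption).strip()
        self.out.append('<img src="%s" alt="%s">' % (escape(self.src), escape(alt)))
        self.caption = None
        self.markup_enabled = True

    def result(self):
        return "".join(self.out)
```

With this, italic events received inside the caption are silently dropped and only the text ends up in the alt attribute, so no cleanup of stray tags is needed afterwards.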
/Andreas