Scott,
On 02/19/2013 03:50 PM, C. Scott Ananian wrote:
<div style="background: #00FF00">-</div>
What's the plan to handle cases like this? Is it really important to generate the entity in the output?
The entity is not important in wt2html mode, but should (ideally) be preserved in wt2wt mode. Client DOM implementations don't preserve the entity encoding, so we'd have to handle this with encoding-tolerant attribute shadowing in the serializer. Handling all attribute normalization cases explicitly would introduce a lot of complexity, however, so we prefer to hide these purely syntactic diffs in unmodified content with selective serialization.
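As a rough illustration of what encoding-tolerant attribute shadowing could look like: the original source text of an attribute is kept next to the DOM value and is reused on serialization only if it still decodes to the same value. All names here (`decodeEntities`, `serializeAttribute`, the `shadow.src` field) are hypothetical and not Parsoid's actual API, and the decoder only covers a handful of entities.

```javascript
// Hypothetical sketch; names and data layout are assumptions, not Parsoid code.

// Decode a small, illustrative subset of HTML entities.
// '&amp;' is handled last so that '&amp;lt;' decodes to '&lt;', not '<'.
function decodeEntities(s) {
  return s
    .replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#([0-9]+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)))
    .replace(/&nbsp;/g, '\u00A0')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&');
}

// Serialize one attribute. `shadow.src` is the attribute value exactly as it
// appeared in the original source (entities intact), if one was recorded.
function serializeAttribute(name, domValue, shadow) {
  if (shadow && decodeEntities(shadow.src) === domValue) {
    // Value is unchanged modulo entity encoding: reuse the original text.
    return name + '="' + shadow.src + '"';
  }
  // Value was edited (or never shadowed): emit a freshly escaped value.
  const escaped = domValue.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return name + '="' + escaped + '"';
}
```

With a shadow of `a&#45;b`, an untouched DOM value `a-b` serializes back with the entity intact, while an edited value such as `a-c` is emitted fresh.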
- "Play a bit with r67090 and bug 3158"
This is a parsoid-only test which looks like:
In standard HTML serialization, a non-breaking space (U+00A0) is encoded uniformly as &nbsp;, so even if you wanted to be bug-compatible with the 'border :' style, you should be emitting a &nbsp; there, not some other encoding of the non-breaking space. The other two cases are whitespace normalization within attributes (again). I'm guessing jsdom (incorrectly) did this by default whether you wanted it or not; you need to explicitly add attribute normalization to the domino case if that's desired.
No normalization in the DOM makes round-tripping easier, and should lead to the same DOM with HTML5-compliant parsers anyway.
- "Parsoid-only: Table with broken attribute value quoting on consecutive lines"
jsdom used to insert the extraneous semicolon at the end of the 'style' attribute. domino does not. I believe this test case is broken and the extraneous semicolon should be removed.
Same issue: CSS normalization in jsdom vs. none in domino.
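If a test comparison ever needs to tolerate these differences rather than fix the expected output, one option is to normalize the 'style' value on both sides before diffing. The following is a minimal sketch under that assumption; `normalizeStyle` is a hypothetical helper, not part of jsdom, domino, or Parsoid, and it deliberately ignores semicolons inside strings or url(...) values.

```javascript
// Hypothetical helper: normalize a style attribute value for comparison only.
// Drops empty declarations (e.g. from a trailing ';') and collapses
// whitespace around the first ':' of each declaration.
function normalizeStyle(style) {
  return style
    .split(';')
    .map((decl) => decl.trim())
    .filter((decl) => decl.length > 0)
    .map((decl) => {
      const i = decl.indexOf(':');
      if (i === -1) {
        return decl; // not a "property: value" pair; leave as-is
      }
      return decl.slice(0, i).trim() + ': ' + decl.slice(i + 1).trim();
    })
    .join('; ');
}

// Compare two style values while ignoring purely syntactic differences.
function sameStyle(a, b) {
  return normalizeStyle(a) === normalizeStyle(b);
}
```

Keeping this in the test comparison, rather than in the DOM, leaves the DOM itself unnormalized for round-tripping.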
The following are really bug reports.
- Other observed bugs & failures:
http://parsoid.wmflabs.org/en/Pi gives:
TypeError: Cannot assign to read only property 'ksrc' of #<KV>
    at AttributeExpander._returnAttributes (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/ext.core.AttributeExpander.js:71:20)
    at AttributeTransformManager.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:1017:8)
    at AttributeExpander.onToken (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/ext.core.AttributeExpander.js:46:7)
    at AsyncTokenTransformManager.transformTokens (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:568:17)
    at AsyncTokenTransformManager.onChunk (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:356:17)
    at SyncTokenTransformManager.EventEmitter.emit (events.js:96:17)
    at SyncTokenTransformManager.onChunk (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:904:7)
    at PegTokenizer.EventEmitter.emit (events.js:96:17)
    at PegTokenizer.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.tokenizer.peg.js:88:11)
    at ParserPipeline.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.parser.js:360:21)
http://localhost:8000/simple/Game gives:
starting parsing of Game
*********** ERROR: cs/s mismatch for node: A s: 3808; cs: 3821 ************
completed parsing of Game in 1491 ms
- [[File:]] tag parsing for images appears to be incomplete:
  a) alt= and class= are not parsed
  b) 'thumb' and 'right' should result in <img class="thumb tright" /> or some such, but there doesn't appear to be an indication of either option in the parsoid output.
- I'd like to see title and revision information in the <head>
This is planned, but not yet implemented.
- Interwiki links are not converted to relative links when the "interwiki" is actually the current wiki. (Maybe this isn't really a bug.)
The PHP parser handles these as internal links, so we should too. We just need to make sure that we preserve the prefix when round-tripping an unmodified link target. The target is already shadowed, so I think the prefix should already be preserved.
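To illustrate the idea, here is a small sketch of treating a link with a local "interwiki" prefix as internal while keeping the original target as a shadow for round-tripping. The function names, the `localPrefixes` list, and the link object layout are all assumptions made for this example; they do not reflect Parsoid's real data structures.

```javascript
// Hypothetical sketch; names and object shape are assumptions.

// Parse a link target. If its prefix names the current wiki, strip it and
// treat the link as internal, but remember the original source text.
function parseLinkTarget(target, localPrefixes) {
  const i = target.indexOf(':');
  let title = target;
  if (i > 0 && localPrefixes.includes(target.slice(0, i).toLowerCase())) {
    title = target.slice(i + 1);
  }
  return {
    title: title,                           // what the DOM exposes to editors
    shadow: { title: title, src: target },  // original text, for round-tripping
  };
}

// Serialize a link target. An unmodified title reuses the original source,
// prefix included; a modified title is emitted as-is.
function serializeLinkTarget(link) {
  return link.title === link.shadow.title ? link.shadow.src : link.title;
}
```

For example, on a wiki whose own prefix is `en`, the target `en:Pi` would parse to the internal title `Pi` and serialize back to `en:Pi` as long as the title is left unmodified.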
Gabriel