Scott,
On 02/19/2013 03:50 PM, C. Scott Ananian wrote:
<div style="background:
#00FF00">-</div>
What's the plan to handle cases like this? Is it really important to
generate the &nbsp; in the output?
The &nbsp; is not important in wt2html mode, but should (ideally) be
preserved in wt2wt mode. Client DOM implementations don't preserve the
entity encoding, so we'd have to handle this with encoding-tolerant
attribute shadowing in the serializer. Handling all attribute
normalization cases explicitly would introduce a lot of complexity,
however, so we prefer to hide these purely syntactic diffs in unmodified
content with selective serialization.
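To make the shadowing idea concrete, here is a minimal sketch of what encoding-tolerant attribute shadowing could look like. It is not Parsoid's actual code: the function names (decodeEntities, shadowAttribute, serializeAttribute) and the shape of the shadow store are assumptions made up for illustration, and the entity decoder only covers the few entities needed here.

```javascript
// Hypothetical sketch of encoding-tolerant attribute shadowing.
// All names below are illustrative assumptions, not Parsoid APIs.

// Decode only the entities needed for this illustration. A real
// implementation would need a complete HTML entity decoder.
function decodeEntities(s) {
  return s
    .replace(/&nbsp;|&#160;|&#xA0;/gi, '\u00A0')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&'); // last, so "&amp;nbsp;" is not decoded twice
}

// Parse time: remember the original source text next to the decoded value
// that ends up in the DOM.
function shadowAttribute(shadowStore, name, sourceText) {
  const value = decodeEntities(sourceText);
  shadowStore[name] = { value: value, source: sourceText };
  return value;
}

// Serialization time: if the DOM value still matches the shadowed value,
// emit the original source text; otherwise escape the new value.
function serializeAttribute(shadowStore, name, currentValue) {
  const shadow = shadowStore[name];
  if (shadow && shadow.value === currentValue) {
    return shadow.source;
  }
  return currentValue
    .replace(/&/g, '&amp;')
    .replace(/\u00A0/g, '&nbsp;')
    .replace(/"/g, '&quot;');
}
```

With this, an unmodified attribute round-trips to whatever entity spelling the source used, while an edited attribute falls back to uniform escaping.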
2) "Play a bit with r67090 and bug 3158"
This is a parsoid-only test which looks like:
In standard HTML serialization, &#160; is encoded uniformly as &nbsp;, so
even if you wanted to be bug-compatible with the 'border :' style, you
should be emitting a &nbsp; not a &#160; there. The other two cases are
whitespace normalization within attributes (again). I'm guessing jsdom
(incorrectly) did this by default whether you wanted it or not; you need
to explicitly add attribute-normalization into the domino case if that's
desired.
No normalization in the DOM makes round-tripping easier, and should lead
to the same DOM with HTML5-compliant parsers anyway.
3) "Parsoid-only: Table with broken attribute value quoting on
consecutive lines"
jsdom used to insert the extraneous semicolon at the end of the 'style'
attribute. domino does not. I believe this test case is broken and the
extraneous semicolon should be removed.
Same issue: CSS normalization in JSDOM vs. none in Domino.
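If explicit normalization is ever wanted (for example, only when comparing test output), it could be an opt-in helper rather than something the DOM does implicitly. The sketch below is an assumption about what such a helper might look like: normalizeStyle and stylesEquivalent are hypothetical names, and the rules (trim whitespace, "name: value", trailing semicolon) only approximate the kind of normalization discussed here, not jsdom's actual algorithm. It does not handle semicolons or colons inside quoted strings or url().

```javascript
// Hypothetical opt-in style normalizer for comparing test output.
// Not part of jsdom, domino, or Parsoid.
function normalizeStyle(style) {
  const declarations = style
    .split(';')
    .map(function (decl) { return decl.trim(); })
    .filter(function (decl) { return decl.length > 0; })
    .map(function (decl) {
      const idx = decl.indexOf(':');
      if (idx === -1) {
        return decl; // not a "name: value" pair, keep it as-is
      }
      const name = decl.slice(0, idx).trim();
      const value = decl.slice(idx + 1).trim();
      return name + ': ' + value;
    });
  if (declarations.length === 0) {
    return '';
  }
  // Terminate every declaration, including the last one, with a semicolon.
  return declarations.join('; ') + ';';
}

// Two style strings count as equivalent if they normalize to the same text.
function stylesEquivalent(a, b) {
  return normalizeStyle(a) === normalizeStyle(b);
}
```

A test harness could then compare with stylesEquivalent(expected, actual) while leaving the DOM itself untouched, which keeps round-tripping simple.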
The following are really bug reports.
* Other observed bugs & failures:
http://parsoid.wmflabs.org/en/Pi gives:
TypeError: Cannot assign to read only property 'ksrc' of #<KV>
    at AttributeExpander._returnAttributes (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/ext.core.AttributeExpander.js:71:20)
    at AttributeTransformManager.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:1017:8)
    at AttributeExpander.onToken (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/ext.core.AttributeExpander.js:46:7)
    at AsyncTokenTransformManager.transformTokens (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:568:17)
    at AsyncTokenTransformManager.onChunk (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:356:17)
    at SyncTokenTransformManager.EventEmitter.emit (events.js:96:17)
    at SyncTokenTransformManager.onChunk (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.TokenTransformManager.js:904:7)
    at PegTokenizer.EventEmitter.emit (events.js:96:17)
    at PegTokenizer.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.tokenizer.peg.js:88:11)
    at ParserPipeline.process (/home/cananian/Projects/OLPC/Narrative/mediawiki/Parsoid/js/lib/mediawiki.parser.js:360:21)
http://localhost:8000/simple/Game gives:
starting parsing of Game
*********** ERROR: cs/s mismatch for node: A s: 3808; cs: 3821 ************
completed parsing of Game in 1491 ms
* [[File:]] tag parsing for images appears to be incomplete:
a) alt= and class= are not parsed
b) 'thumb' and 'right' should result in <img class="thumb tright" />
or some such, but there doesn't appear to be an indication of either
option in the parsoid output.
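As a rough illustration of the option handling being asked for, the following sketch maps image options to an alt text and a class list. Everything in it is an assumption: parseImageOptions is a made-up name, the 't' + position class naming simply follows the "thumb tright" example above, and the regular expression only handles simple links without nested brackets.

```javascript
// Hypothetical sketch of [[File:...]] option parsing; not Parsoid code.
function parseImageOptions(wikitext) {
  // Capture the target and the raw "|opt|opt..." tail of a simple link.
  const match = /^\[\[(File:[^|\]]+)((?:\|[^\]]*)?)\]\]$/.exec(wikitext);
  if (!match) {
    return null;
  }
  const result = { target: match[1], classes: [], alt: null, caption: null };
  const parts = match[2] ? match[2].slice(1).split('|') : [];
  parts.forEach(function (part) {
    const opt = part.trim();
    if (opt === 'thumb') {
      result.classes.push('thumb');
    } else if (opt === 'left' || opt === 'right' || opt === 'none') {
      // Assumed class naming, modeled on the "tright" example.
      result.classes.push('t' + opt);
    } else if (opt.indexOf('alt=') === 0) {
      result.alt = opt.slice(4);
    } else if (opt.indexOf('class=') === 0) {
      opt.slice(6).split(/\s+/).forEach(function (cls) {
        if (cls) { result.classes.push(cls); }
      });
    } else {
      // Anything unrecognized is treated as the caption in this sketch.
      result.caption = opt;
    }
  });
  return result;
}
```

For '[[File:Foo.jpg|thumb|right|alt=A foo|Caption]]' this yields the classes 'thumb' and 'tright', the alt text 'A foo', and the caption 'Caption'.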
* I'd like to see title and revision information in the <head>
This is planned, but not yet implemented.
* Interwiki links are not converted to relative links when the
"interwiki" is actually the current wiki. (Maybe this isn't really a bug.)
The PHP parser handles these as internal links, so we should too. We just
need to make sure that we preserve the prefix when round-tripping an
unmodified link target. The target is already shadowed, so I think the
prefix should already be preserved.
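A minimal sketch of that behaviour, under stated assumptions: resolveLinkTarget and serializeLinkTarget are hypothetical names, the list of local prefixes is passed in by the caller, and real interwiki links to other wikis are left out of scope.

```javascript
// Hypothetical sketch: treat a link whose "interwiki" prefix points at the
// current wiki as an internal link, and shadow the original target so an
// unmodified link round-trips with its prefix intact. Not Parsoid code.
function resolveLinkTarget(target, localPrefixes) {
  let title = target;
  const idx = target.indexOf(':');
  if (idx > 0) {
    const prefix = target.slice(0, idx).toLowerCase();
    if (localPrefixes.indexOf(prefix) !== -1) {
      title = target.slice(idx + 1); // drop the self-referencing prefix
    }
  }
  const href = './' + title.replace(/ /g, '_');
  return {
    href: href,
    // Shadow: the resolved href plus the exact source text it came from.
    shadow: { href: href, source: target }
  };
}

function serializeLinkTarget(link) {
  if (link.href === link.shadow.href) {
    return link.shadow.source; // unmodified: emit the original, prefix included
  }
  // Modified: derive a fresh target from the new href.
  return link.href.replace(/^\.\//, '').replace(/_/g, ' ');
}
```

For example, with 'en' as a local prefix, 'en:Main Page' resolves to './Main_Page' and serializes back to 'en:Main Page' as long as the href is untouched.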
Gabriel