Hi Parsoid developers,

  I compared the Wikipedia HTML and the Parsoid HTML (same title and oldid) for 500 randomly sampled pages, and found some bugs and recurring differences that may be useful to you. We hope the bugs can be fixed. Thanks! The examples are below:
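
For reference, the two URLs compared for each sample follow the pattern used in the examples below. This is only a minimal sketch of how such a pair can be built from a title and an oldid; the function name `url_pair` and the choice of characters left unescaped are my own assumptions for illustration, not part of Parsoid or MediaWiki.

```python
from urllib.parse import quote

# Base URLs taken from the examples in this mail.
WIKIPEDIA_BASE = "http://en.wikipedia.org/w/index.php"
PARSOID_BASE = "http://parsoid-lb.eqiad.wikimedia.org/enwiki"


def url_pair(title, oldid):
    """Return (wikipedia_url, parsoid_url) for the same title and oldid.

    `title` uses underscores instead of spaces. Non-ASCII characters are
    percent-encoded as UTF-8; "$", ",", "(" and ")" are left as-is to
    match the URLs quoted in this mail (an assumption of this sketch).
    """
    encoded = quote(title, safe="$,()")
    wikipedia_url = f"{WIKIPEDIA_BASE}?title={encoded}&oldid={oldid}"
    parsoid_url = f"{PARSOID_BASE}/{encoded}?oldid={oldid}"
    return wikipedia_url, parsoid_url


if __name__ == "__main__":
    for url in url_pair("1913_Gettysburg_reunion", 581251478):
        print(url)
```

Both pages can then be fetched and compared for the same revision.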

Bug examples:
1. In http://parsoid-lb.eqiad.wikimedia.org/enwiki/1913_Gettysburg_reunion?oldid=581251478, reference 18 is “(Pennsylvania Department of Health). http://books.google.com/books?id=swkTAAAAYAAJ&pg=PA72. Retrieved 2011-02-06.”, but in http://en.wikipedia.org/w/index.php?title=1913_Gettysburg_reunion&oldid=581251478 it is “(Pennsylvania Department of Health). Retrieved 2011-02-06.”, without the bare URL.

2. The first external link in http://en.wikipedia.org/w/index.php?title=...From_the_Hungry_i&oldid=555958525 is “The Kingston Trio Liner Notes album entry.”, but in http://parsoid-lb.eqiad.wikimedia.org/enwiki/...From_the_Hungry_i?oldid=555958525 it is “[http://www.lazyka.com/linernotes/trio_01(Guard,Rynolds,Shane)/recrdngs/LP_T1107.htm#.%20.%20.%20From%20the%20hungry%20i: The Kingston Trio Liner Notes album entry.]”. The link markup appears to be left unparsed, which is an obvious bug.

3. In http://en.wikipedia.org/w/index.php?title=1973_CARIFTA_Games&oldid=473380600, every table has a header row: “Event Gold Silver Bronze”. But in http://parsoid-lb.eqiad.wikimedia.org/enwiki/1973_CARIFTA_Games?oldid=473380600, the header row is missing.

4. In http://en.wikipedia.org/w/index.php?title=Airdisco_Phi-Phi&oldid=551648808, there is a table on the right: “Phi-Phi … Number built 1”. But it is missing in http://parsoid-lb.eqiad.wikimedia.org/enwiki/Airdisco_Phi-Phi?oldid=551648808.

5. The figcaption is not displayed on Wikipedia but is displayed by Parsoid. Example 1: “Breg , the old part of Novo Mesto along the Krka River” appears in http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C5%A0entjo%C5%A1t,_Novo_Mesto?oldid=542922305 but does not exist in http://en.wikipedia.org/w/index.php?title=%C5%A0entjo%C5%A1t,_Novo_Mesto&oldid=542922305. Example 2: “T-6 Texan IIs over Columbus Mississippi” appears twice in http://parsoid-lb.eqiad.wikimedia.org/enwiki/14th_Operations_Group?oldid=572478542 but only once in http://en.wikipedia.org/w/index.php?title=14th_Operations_Group&oldid=572478542.

6. The links “[1] [2] ...” in the text or in the references are missing from the Parsoid HTML. Example 1: “[1] [2] [3] [4]” appears in http://en.wikipedia.org/w/index.php?title=1982_PBA_Open_Conference&oldid=582521559 but is missing in http://parsoid-lb.eqiad.wikimedia.org/enwiki/1982_PBA_Open_Conference?oldid=582521559. Example 2: “[1]” appears in http://en.wikipedia.org/w/index.php?title=2008%E2%80%9309_Barnsley_F.C._season&oldid=561135626 but is missing in http://parsoid-lb.eqiad.wikimedia.org/enwiki/2008%E2%80%9309_Barnsley_F.C._season?oldid=561135626.

Other difference patterns, with examples:
1. http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 has a table of contents, but http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749 does not.

2. http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 has a clickable “[edit]” link after each section heading, but http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749 does not.

3. The sign “^ ” in the references of http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 is replaced with “¡ü” in http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749.

4. The superscripts “a b c d” etc. in the references of http://en.wikipedia.org/w/index.php?title=%C3%87a_plane_pour_moi&oldid=582236844 are replaced with “{num}.0 {num}.1 {num}.2 {num}.3” etc. in http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C3%87a_plane_pour_moi?oldid=582236844.

5. The audio player component may differ between http://en.wikipedia.org/w/index.php?title=%C3%89tincelles_(Moszkowski)&oldid=555997335 (see “Problems playing this file?”) and http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C3%89tincelles_(Moszkowski)?oldid=555997335.
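
A note on pattern 3 above: “¡ü” may be an encoding artifact on the comparison side rather than what Parsoid actually emits. If Parsoid outputs an up arrow (“↑”) for the back-link, and that character is written as GB2312 bytes and then read back as Latin-1, the result is exactly “¡ü”. This is an assumption that I have not verified against the raw Parsoid output; the sketch below only shows that the byte-level round trip produces the same two characters. The helper name `mojibake` is made up for illustration.

```python
def mojibake(text, written_as="gb2312", read_as="latin-1"):
    """Encode `text` with one codec and decode the bytes with another.

    This reproduces the garbling that happens when a tool writes text in
    one encoding and another tool reads it assuming a different one.
    """
    return text.encode(written_as).decode(read_as)


if __name__ == "__main__":
    up_arrow = "\u2191"  # the character "↑" (assumed Parsoid back-link sign)
    garbled = mojibake(up_arrow)
    # Print the garbled text together with its code points.
    print(garbled, [hex(ord(ch)) for ch in garbled])
```

If this is the cause, the real difference in pattern 3 is only “^” versus “↑”.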

--
Bin Li