Hi Parsoid developers,
I compared the Wikipedia HTML and the Parsoid HTML (same title and oldid)
for 500 randomly sampled pages and found some bugs and recurring
difference patterns that may help you. We hope the bugs can be fixed. Thanks!
The examples are below.
Bug examples:
1. In
http://parsoid-lb.eqiad.wikimedia.org/enwiki/1913_Gettysburg_reunion?oldid=…,
reference 18 is “(Pennsylvania Department of Health).
http://books.google.com/books?id=swkTAAAAYAAJ&pg=PA72PA72. Retrieved
2011-02-06.”, but in
http://en.wikipedia.org/w/index.php?title=1913_Gettysburg_reunion&oldid…1478,
it’s “(Pennsylvania Department of Health). Retrieved 2011-02-06.”
2. The first external link in
http://en.wikipedia.org/w/index.php?title=...From_the_Hungry_i&oldid=55…
is “The Kingston Trio Liner Notes album entry.”, but in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/...From_the_Hungry_i?oldid=555…
it is “[
http://www.lazyka.com/linernotes/trio_01(Guard,Rynolds,Shane)/recrdngs/LP_T…:
The Kingston Trio Liner Notes album entry.]”. This is clearly a bug.
3. In
http://en.wikipedia.org/w/index.php?title=1973_CARIFTA_Games&oldid=4733…0600,
every table has a header row: “Event Gold Silver Bronze”. But in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/1973_CARIFTA_Games?oldid=47338…,
the header row is missing.
4. In
http://en.wikipedia.org/w/index.php?title=Airdisco_Phi-Phi&oldid=551648…8808,
there is a table on the right: “Phi-Phi … Number built 1”. But it
is missing from
http://parsoid-lb.eqiad.wikimedia.org/enwiki/Airdisco_Phi-Phi?oldid=5516488…
.
5. The figcaption is not displayed on Wikipedia but is displayed in Parsoid.
Example 1: see “Breg , the old part of Novo Mesto along the Krka River” in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C5%A0entjo%C5%A1t,_Novo_Mesto…,
it does not exist in
http://en.wikipedia.org/w/index.php?title=%C5%A0entjo%C5%A1t,_Novo_Mesto&am…2305.
Example 2: “T-6 Texan IIs over Columbus Mississippi” appears twice in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/14th_Operations_Group?oldid=57…
but only once in
http://en.wikipedia.org/w/index.php?title=14th_Operations_Group&oldid=5…
.
6. Links such as “[1] [2] ...” in the text or references are missing from
the Parsoid HTML.
Example 1: see “[1] [2] [3] [4]” in
http://en.wikipedia.org/w/index.php?title=1982_PBA_Open_Conference&oldi…1559,
they are missing from
http://parsoid-lb.eqiad.wikimedia.org/enwiki/1982_PBA_Open_Conference?oldid….
Example 2: “[1]” in
http://en.wikipedia.org/w/index.php?title=2008%E2%80%9309_Barnsley_F.C._sea…5626,
is missing from
http://parsoid-lb.eqiad.wikimedia.org/enwiki/2008%E2%80%9309_Barnsley_F.C._…
.
Other difference patterns, with examples:
1.
http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 has a
table of contents, but
http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749 does not.
2.
http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 has a
clickable “[edit]” link after each section heading, but
http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749 does not.
3. The “^” sign in the references of
http://en.wikipedia.org/w/index.php?title=$pent&oldid=535219749 is replaced
with “↑” in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/$pent?oldid=535219749.
4. The superscripts “a b c d” etc. in the references of
http://en.wikipedia.org/w/index.php?title=%C3%87a_plane_pour_moi&oldid=…
are replaced with “{num}.0 {num}.1 {num}.2 {num}.3” etc. in
http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C3%87a_plane_pour_moi?oldid=5…
5. The audio player component may differ between
http://en.wikipedia.org/w/index.php?title=%C3%89tincelles_(Moszkowski)&…
(“Problems playing this file?”) and
http://parsoid-lb.eqiad.wikimedia.org/enwiki/%C3%89tincelles_(Moszkowski)?o…
.
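In case it helps anyone reproduce this kind of comparison, here is a minimal Python sketch of a text-level diff that ignores two of the cosmetic differences listed above (the “[edit]” links and “^” versus “↑”). It is not the script used for the 500-sample run; the names visible_text, normalize and text_diff are hypothetical, it works on HTML strings that have already been fetched, and it does not try to strip the table of contents.

```python
import difflib
import re
from html.parser import HTMLParser

# Tags treated as word boundaries so adjacent blocks do not fuse together.
BLOCK_TAGS = {"p", "div", "li", "tr", "td", "th", "br",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def normalize(text):
    # Assumption: these two differences are cosmetic and can be ignored.
    text = text.replace("[edit]", " ")   # section edit links
    text = text.replace("\u2191", "^")   # "↑" back-reference marker
    return re.sub(r"\s+", " ", text).strip()


def text_diff(wikipedia_html, parsoid_html):
    """Return (opcode, wikipedia_words, parsoid_words) for each difference."""
    a = normalize(visible_text(wikipedia_html)).split()
    b = normalize(visible_text(parsoid_html)).split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

Each returned tuple is a word-level “replace”, “delete” or “insert” opcode from difflib, so a missing “[1]” link like the ones in bug 6 would show up as a “delete” entry, while a page that differs only in “[edit]” links or the “↑” marker would return an empty list.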
--
Bin Li