Hi,
I'm trying to get the same output for references from Parsoid as from the PHP parser. For my tests I use the first reference of the following article: https://en.wikipedia.org/wiki/Kiwix
== Call ==
* PHP: 100+ languages<sup id="cite_ref-sourceforge_1-0" class="reference"><a href="#cite_note-sourceforge-1"><span>[</span>1<span>]</span></sup>
* Parsoid (simplified): 100+ languages<span class="reference" id="cite_ref-sourceforge-1-0"><a href="#cite_note-sourceforge-1">[1]</a>
Parsoid uses a <span> tag instead of a <sup> tag. I guess this is normal, but I'm not sure...
== References (multiple calls in this example) ==
* PHP: ^ <a href="#cite_ref-sourceforge_1-0"><sup><i><b>a</b></i></sup></a> <a href="#cite_ref-sourceforge_1-1"><sup><i><b>b</b></i></sup></a></span> <span class="reference-text"><span class="citation web"><a rel="nofollow" class="external text" href="http://sourceforge.net/projects/kiwix/">"Kiwix"</a>
* Parsoid (simplified): ↑ <a href="#cite_ref-sourceforge-1-0">1.0</a> <a href="#cite_ref-sourceforge-1-1">1.1</a></span><span><span class="citation web"><a href="http://sourceforge.net/projects/kiwix/" class="external">"Kiwix"</a>
The differences:
* "↑" vs. "^": this seems strange to me... Why?
* Each backlink label is a number instead of a letter, for example "1.0" instead of "a". BTW, in the Arabic language the letter shouldn't be "a", but the first letter of the Arabic alphabet. Here again I don't know if this is normal...
* No styling for the references, for example "1.0" instead of <sup><i><b>a</b></i></sup>. Normal?
* It seems to me that Parsoid forgets to put a space right after the "1.1" reference, so the reference text is visually stuck to it. This looks like a small bug to me.
Could you please tell me which of these things are features and which ones are bugs, so I know what I should fix on my side?
Kind regards Emmanuel
Hi Emmanuel,
our Cite implementation is relatively simplistic at this point. Our highest priority is exposing the semantic reference information for editing and further processing.
We don't implement the extensive customization options in the PHP Cite extension [1]. Wherever possible, we are trying to avoid customizing / complicating the DOM structures in favor of CSS-based customizations. We have a bug for sup vs. span [2], although the HTML5 spec suggests using CSS rather than sup in non-semantic situations. Ideally future small tweaks to our DOM output should not affect the RDFa structures and attributes we are using to expose the semantic information. This lets consumers reprocess specific content like references any way they like without being wedded to every detail of our output structure.
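As an illustration of that last point, here is a minimal Python sketch of a consumer that keys on the class and id attributes rather than on the tag name. It assumes the simplified markup quoted earlier in this thread (class="reference" on the call element, followed by the link to the note). The names ReferenceCollector and collect_references are made up for this example, and real Parsoid output carries additional RDFa attributes that are not shown here.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect (call id, note target) pairs from reference calls.

    Assumption: a reference call is any element whose class list contains
    "reference", and the first <a> after it links to the note.
    """

    def __init__(self):
        super().__init__()
        self.references = []
        self._pending_id = None  # id of the call element awaiting its link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "reference" in classes:
            # Works for <sup class="reference"> and <span class="reference">.
            self._pending_id = attrs.get("id")
        elif tag == "a" and self._pending_id is not None:
            self.references.append((self._pending_id, attrs.get("href")))
            self._pending_id = None


def collect_references(html):
    """Return the list of (call id, note target) pairs found in html."""
    parser = ReferenceCollector()
    parser.feed(html)
    parser.close()
    return parser.references
```

With this approach the <sup> and the <span> variant of the call are handled the same way, so the tag choice does not matter to the consumer.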
The differences:
- "↑", "^", seems to me strange... Why?
The "↑" is the Cite default; "^" is a customization in the English Wikipedia, which should also be doable with CSS.
- Each reference content is a number instead of a letter, for example
"1.0" instead of "a". In Arabic this is not "a". BTW, in the Arabic language the letter shouldn't be "a", but the first letter of the Arabic alphabet. Here again I don't know if this is normal...
Maybe this could be handled with CSS counters and list-styles [3] instead, but if not then we'll likely have to implement different numbering styles ourselves. The numbers have no semantic value, so this is easy to add later without breaking editing.
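If the labels end up being generated on the consumer side, a sketch along these lines might do. It maps a 0-based backlink index (the part after the dot in "1.0", "1.1") to a letter label from a configurable alphabet. The function names, the rollover to two-letter labels after the last letter, and the Arabic letter order used here are assumptions for illustration, not what the PHP Cite extension or Parsoid actually do.

```python
import string

LATIN = string.ascii_lowercase
# Assumption: Arabic letters in the usual alphabetical order, 28 letters.
ARABIC = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي"


def backlink_label(index, alphabet=LATIN):
    """Map a 0-based index to a letter label: 0 -> 'a', 25 -> 'z', 26 -> 'aa'."""
    if index < 0:
        raise ValueError("index must be non-negative")
    base = len(alphabet)
    n = index + 1
    chars = []
    # Bijective base-N numbering: there is no "zero" letter.
    while n > 0:
        n, rem = divmod(n - 1, base)
        chars.append(alphabet[rem])
    return "".join(reversed(chars))


def label_from_parsoid_text(text, alphabet=LATIN):
    """Turn a label such as '1.1' into a letter, using the part after the dot."""
    index = int(text.rsplit(".", 1)[1])
    return backlink_label(index, alphabet)
```

For example, label_from_parsoid_text("1.0") gives "a" and label_from_parsoid_text("1.1") gives "b". Passing alphabet=ARABIC yields the first and second letter of the Arabic alphabet instead.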
- No styling for the references, for example "1.0" instead of
<sup><i><b>a</b></i></sup>. Normal?
This should really be done in CSS.
- It seems to me the parsoid forgets to put a space just behind the "1.1"
reference, so the text of reference is "stick/concatenated" visually. Seems to me to be a small bug.
Indeed: https://bugzilla.wikimedia.org/show_bug.cgi?id=49314
Gabriel
[1]: http://www.mediawiki.org/wiki/Extension:Cite/Cite.php#Customization
[2]: https://bugzilla.wikimedia.org/show_bug.cgi?id=43094
[3]: http://css-tricks.com/numbering-in-style/
Hi Gabriel,
Thank you. I will try to implement solutions based on these answers.
Regards Emmanuel
On 07/06/2013 19:32, Gabriel Wicke wrote:
wikitext-l@lists.wikimedia.org