Hello everyone,
As part of the effort to align Parsoid's output with the output of the
legacy parser [1], we introduced a potentially backwards-incompatible
change [2] in the latest Parsoid version (0.17.0-a1, deployed along with
1.40.0-wmf.1).
Until that version, the "rel" attribute of tags in Parsoid HTML only ever
held a single value, although this was never guaranteed. In particular, it
was possible to access all the external links of a page with
`a[rel="mw:ExtLink"]`. We updated our known clients [3], but did not
communicate the change beyond updating our Parsoid HTML specification [4].
Selectors relying on the "rel" attribute of Parsoid HTML should
consequently be updated to take into account that the attribute can (and
does) now contain multiple space-separated values, as specified in the HTML
Living Standard [5]. A list of what to check is provided in the client-side
ticket [3]. In most cases, a selector like `a[rel="mw:ExtLink"]` just needs
a single character added: `a[rel~="mw:ExtLink"]` will correctly match
multi-valued rel attributes.
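For clients that read Parsoid HTML without a CSS selector engine, the same
token-based matching can be sketched as below. This is only a minimal
illustration using the Python standard library: the names
`ExtLinkCollector` and `external_links`, the sample markup, and the extra
`nofollow` value are assumptions made for the example, not something
taken from Parsoid itself.

```python
from html.parser import HTMLParser


class ExtLinkCollector(HTMLParser):
    """Collect href values of <a> tags whose rel tokens include mw:ExtLink."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Treat rel as a list of whitespace-separated tokens (what `~=` does),
        # instead of comparing the whole attribute string (what `=` does).
        rel_tokens = (attr_map.get("rel") or "").split()
        if "mw:ExtLink" in rel_tokens:
            self.hrefs.append(attr_map.get("href"))


def external_links(html):
    """Return the hrefs of all external links found in an HTML string."""
    collector = ExtLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


# Hypothetical sample markup: the second link carries two rel values.
sample = (
    '<a rel="mw:ExtLink" href="https://example.org/a">a</a>'
    '<a rel="mw:ExtLink nofollow" href="https://example.org/b">b</a>'
    '<a rel="mw:WikiLink" href="./Foo">Foo</a>'
)

print(external_links(sample))
```

A check written as `rel == "mw:ExtLink"` would miss the second link in the
sample, which is the same failure mode as the `a[rel="mw:ExtLink"]`
selector.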
We apologize for the late communication and the inconvenience.
Best regards,
Isabelle, for the Content Transform Team
[1] https://phabricator.wikimedia.org/T186241
[2] https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/822655
[3] https://phabricator.wikimedia.org/T315209
[4] https://www.mediawiki.org/wiki/Specs/HTML/2.6.0
[5] https://html.spec.whatwg.org/multipage/semantics.html#attr-link-rel
--
*Isabelle Hurbain-Palatin* (she/her)
Senior Software Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>