Hello everyone,
TL;DR:
As you are aware from previous postings on this list [1] [2] [3]
[4] [5] [6], we have been progressively replacing Tidy with
RemexHtml on all wikis on the Wikimedia cluster. As of today,
about 650 wikis, including a number of large ones, have made the
switch. We aim to complete the switchover on the remaining
250 wikis by the end of June 2018. Another 40 or so wikis will be
switched on May 2nd.
There are a few large wikis (es, pt, uk, zh especially) that
could use more attention in addressing Linter issues, so that when
we make the switch at the end of June, pages on these wikis don't
render differently from how they do now.
Details:
I started investigating more closely where the remaining large
wikis are with respect to the linter issues (high priority
categories on the Special:LintErrors page) that are pertinent to
these wikis. Below, I list results from running SQL queries
on quarry.wmflabs.org for these wikis. If you are a community
member on any of these wikis, please try to address these issues
on your wiki.
See https://quarry.wmflabs.org/query/26474 for counts of linter
issues for each of the 9 categories in the main namespace.
English Wikipedia:
See https://quarry.wmflabs.org/query/25665 for counts of linter issues for each of the 9 categories in the main namespace.
English Wikipedia has been making slow and gradual progress. I
think overall, despite there still being ~8300 instances (not
pages) that need fixing, enwp is in pretty good shape for
replacing Tidy by the end of June.
Commons:
See https://quarry.wmflabs.org/query/25693 for counts of linter issues for each of the 9 categories in the File (ns6), Gallery (ns0), and Template (ns10) namespaces.
The vast majority of html5-misnesting errors on Commons seem to come from the use of the {{lang}} template, which uses a <span> tag to wrap content. However, it seems to be extremely common to pass content containing paragraphs into the {{lang}} template. Right now, this doesn't cause any visible rendering issues and could be ignored temporarily, but we strongly recommend one of two fixes: either change {{lang}} to use a <div> tag, or, on pages which misuse {{lang}} this way, replace {{lang}} with a new template ({{lang-block}} maybe?) that uses a <div> tag.
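
For anyone who wants to locate affected pages by script, here is a rough Python sketch of how one might flag {{lang}} calls that wrap paragraph content. It is only an illustration under stated assumptions: the function and pattern names are hypothetical, it handles only simple (non-nested) {{lang|code|text}} calls, and it treats a blank line or an explicit block-level tag inside the text as a sign of block content. It is not how Linter itself detects html5-misnesting.

```python
import re

# Hypothetical helper names; none of this is part of MediaWiki or Linter.
# Matches a simple {{lang|code|text}} call with no nested templates.
LANG_CALL = re.compile(
    r"\{\{\s*lang\s*\|\s*([^|{}]+?)\s*\|([^{}]*)\}\}",
    re.IGNORECASE,
)

# Assumption: a blank line (paragraph break) or an explicit block-level
# tag inside the argument means block content is being put in a <span>.
BLOCK_HINT = re.compile(
    r"\n[ \t]*\n|<\s*(?:p|div|ul|ol|table|blockquote)\b",
    re.IGNORECASE,
)


def find_block_content_in_lang(wikitext):
    """Return (language_code, content) pairs for suspicious {{lang}} calls."""
    hits = []
    for match in LANG_CALL.finditer(wikitext):
        code = match.group(1)
        content = match.group(2).strip()
        if BLOCK_HINT.search(content):
            hits.append((code, content))
    return hits


if __name__ == "__main__":
    sample = (
        "{{lang|de|Erster Absatz.\n\nZweiter Absatz.}} "
        "and {{lang|fr|Bonjour}}"
    )
    for code, content in find_block_content_in_lang(sample):
        print(code, repr(content))
```

In the sample, only the German call is reported, since its text contains a paragraph break; the French call is plain inline text and is left alone. Calls flagged this way are candidates for a <div>-based template such as the {{lang-block}} suggested above.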