On Wed, Aug 19, 2015 at 1:22 PM, MZMcBride <z(a)mzmcbride.com> wrote:
> Bartosz wrote:
> > We really do need this feature. Not anything else that Tidy does, most
> > of its behavior is actually damaging, but we need to match the open and
> > close tags to prevent the interface from getting jumbled.
>
> My reading of this thread is that this is the consensus view. The problem,
> as I see it, is that Tidy has been deployed long enough that some users
> are also relying on all of its other bad behaviors. It seems to me that a
> replacement for Tidy either has to reimplement all of its unwanted
> behaviors to avoid breakage with current wikitext or it has to break an
> unknown amount of current wikitext.

My $0.02 from the peanut gallery: if we fix up the bulk of the most
common cases (those where the bad HTML is not the result of an edit
error), could we keep a Tidy/HTML5 type of thing around, but move it
from render output processing to edit validation? We could start by
leaving the current output-side code alone and warning about edits
that fail validation, both to the user (as a minor info blurb on edit
submission) and in our logs. That would give us some idea of the scope
and causes of the problem. We could then fix what we can, evaluate
whether we can eventually start flat-out rejecting the minority of
edits that fail validation, and finally remove Tidy on the output
side. That ignores the whole problem of existing bad HTML already in
the DB, of course, but that could probably be fixed with a one-time
job...
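
To make the warn-first idea concrete, here is a rough sketch of what an edit-time check might look like. It is not MediaWiki code: the names (validate_edit, TagBalanceChecker, the "edit_validation" logger) are made up for illustration, it uses Python's standard html.parser rather than Tidy or a real HTML5 parser, and it only checks that open and close tags match, which is the one Tidy behavior the thread agrees we need.

```python
import logging
from html.parser import HTMLParser

# HTML void elements never take a closing tag, so they are not tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

log = logging.getLogger("edit_validation")  # hypothetical logger name


class TagBalanceChecker(HTMLParser):
    """Records open/close tag mismatches instead of repairing them."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, innermost last
        self.problems = []   # human-readable descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Anything opened after the matching tag was never closed.
        while self.stack[-1] != tag:
            self.problems.append(f"unclosed <{self.stack.pop()}>")
        self.stack.pop()


def validate_edit(text, reject=False):
    """Return (accepted, problems). With reject=False this only warns."""
    checker = TagBalanceChecker()
    checker.feed(text)
    checker.close()
    # Tags still open at the end of the text are also unclosed.
    problems = checker.problems + [
        f"unclosed <{tag}>" for tag in reversed(checker.stack)
    ]
    for problem in problems:
        log.warning("edit failed validation: %s", problem)
    accepted = not (reject and problems)
    return accepted, problems
```

Starting with reject=False corresponds to the first phase (accept the edit, show the blurb, write the log line). Flipping it to True later would be the "flat-out rejecting" phase, once the logs show how common and how fixable the failures are.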