On Wed, Aug 19, 2015 at 1:22 PM, MZMcBride <z@mzmcbride.com> wrote:
Bartosz wrote:
We really do need this feature. Not anything else that Tidy does (most of its behavior is actually damaging), but we do need to match the open and close tags to prevent the interface from getting jumbled.
My reading of this thread is that this is the consensus view. The problem, as I see it, is that Tidy has been deployed long enough that some users are also relying on all of its other bad behaviors. It seems to me that a replacement for Tidy either has to reimplement all of its unwanted behaviors to avoid breakage with current wikitext or it has to break an unknown amount of current wikitext.
My $0.02 from the peanut gallery: once we have fixed up the bulk of the most common cases we can (those where the bad HTML is not the result of an edit error), could we keep a Tidy/HTML5 type of thing around, but move it from render-time output processing to edit-time validation? We could start by leaving the current output-side code alone and only warning about edits that fail validation, both to the user (as a minor info blurb on edit submission) and in our logs. That would give us some idea of the scope and causes of the problem. We could then fix what we can, evaluate whether we can start flat-out rejecting the minority of edits that still fail validation, and eventually remove Tidy on the output side. That ignores the whole problem of existing bad HTML already in the DB, of course, but that could probably be fixed with a one-time job...
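
To make the warn-first idea a bit more concrete, here is a rough Python sketch of what an edit-time check could look like. It is only an illustration under some loud assumptions: it is not MediaWiki code, the names `validate_edit` and `check_submission` and the `reject_invalid` switch are made up for this mail, and it only looks at raw HTML tags in the submitted text, ignoring wikitext syntax entirely. It uses Python's standard `html.parser` to match open and close tags, logs whatever it finds, and accepts the edit anyway unless rejection is switched on.

```python
import logging
from html.parser import HTMLParser

log = logging.getLogger("edit_validation")  # hypothetical logger name

# HTML void elements never take a closing tag, so they are not tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Collects warnings about unmatched open/close tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tag names
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.warnings.append(f"stray closing tag </{tag}>")
            return
        # Pop back to the matching opener; anything skipped was left unclosed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.warnings.append(f"unclosed <{top}> before </{tag}>")


def validate_edit(text):
    """Return a list of tag-balance warnings for the submitted text."""
    checker = TagBalanceChecker()
    checker.feed(text)
    checker.close()
    for tag in reversed(checker.open_tags):
        checker.warnings.append(f"unclosed <{tag}> at end of text")
    return checker.warnings


def check_submission(text, reject_invalid=False):
    """Warn-first policy: log problems, reject only if explicitly enabled."""
    warnings = validate_edit(text)
    for message in warnings:
        log.warning("edit failed validation: %s", message)
    accepted = not (reject_invalid and warnings)
    return accepted, warnings
```

In the first phase one would call `check_submission(text)` and show the returned warnings to the user as the info blurb; flipping `reject_invalid=True` later corresponds to the "flat-out rejecting" stage, once the logs show the failure rate is small enough.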