Hi John and Risker,
First off, I do want to once again clarify that my intention in the
previous post was not to claim that VE/Parsoid is perfect. It was more
that we've fixed sufficient bugs at this point that the most significant
"bugs" (bugs, not missing features) that need fixing (and are being
fixed) are those that have to do with usability tweaks. My intention in
that post was also not to put some distance between us and the
complaints, just to clarify that we are fixing things as fast as we can,
as can be seen in the recent changes stream.
John: specific answers to the edit diffs you highlighted in your post.
I acknowledge your intention to make sure we don't make false claims
about VE/Parsoid's usability. Thanks for taking the time to dig
them up. My answers below are intended to figure out
what the issues are so they can be fixed where they need to be.
On 07/23/2013 02:50 AM, John Vandenberg wrote:
On Tue, Jul 23, 2013 at 4:32 PM, Subramanya Sastry
<ssastry(a)wikimedia.org> wrote:
On 07/22/2013 10:44 PM, Tim Starling wrote:
Round-trip bugs, and bugs which cause a given wikitext input to give
different HTML in Parsoid compared to MW, should have been detected
during automated testing, prior to beta deployment. I don't know why
we need users to report them.
500+ edits are being done per hour using Visual Editor [1] (less at this
time given that it is way past midnight -- I have seen about 700/hour at
times). I did go and click on over 100 links and examined the diffs. I did
that twice in the last hour. I am happy to report clean diffs on all edits
I checked both times.
I did run into a couple of nowiki-insertions which is, strictly
speaking, not erroneous and based on user input, but is more a
usability issue.
What is a dirty diff? One that inserts junk unexpectedly, unrelated
to the user's input?
That is correct. Strictly speaking, a dirty diff is any change to the
wikitext markup in parts of the page that the user didn't edit.
The broken table injection bugs are still happening.
https://en.wikipedia.org/w/index.php?title=Sai_Baba_of_Shirdi&curid=144…
If the parser isn't going to be fixed quickly to ignore tables it
doesn't understand, we need to find the templates and pages with these
broken tables - preferably using SQL and heuristics and fix them. The
same needs to be done for all the other wikis, otherwise they are
going to have the same problems happening randomly, causing lots of
grief.
This may be related to this:
https://bugzilla.wikimedia.org/show_bug.cgi?id=51217 and I have a
tentative fix for it as of yesterday.
VE and Parsoid devs have put in a lot of effort to recognize
broken wikitext source, fix it or isolate it, protect it across
edits, and roundtrip it back in its original form to prevent corruption. I
think we have been largely successful, but more cases are still being
exposed here, and we will fix them. Occasionally, these
kinds of errors do show up -- and we ask for your patience as we fix
them. Once again, this is not a claim to perfection, but a claim that
this is not a significant source of corrupt edits. But yes, even a 0.1%
error rate does mean a big number in absolute terms when thousands of
pages are being edited -- and we will continue to pare this down.
Yes, this is a dirty diff where Parsoid reformatted a 2-line image
wikitext source into one by removing a line break. Again, relative to
the # of edits being made, these are not frequent -- that was all that I
claimed, which I think is still true. But that said, this is a
relatively minor and harmless change.
You are correct, but this is not a dirty diff. I don't want to claim
this is a user error entirely -- but a combination of user and
software error.
Here are three edits to try to add a section header and a sentence,
with a wikilink in the section header.
(In the process they added other junk into the page, probably unintentionally.)
https://en.wikipedia.org/w/index.php?title=Port_of_Davao&action=history…
What is the problem here exactly? (that is a question, not a
challenge). The user might have entered those newlines as well.
This is something I'll have to investigate.
This could be an enhancement to Parsoid. Thanks for the bug report :-).
That is all in the last hour, and I've only checked ~100 diffs.
I appreciate that some of these are a result of user input, and the RC
feed of non-VE edits will have similar problems exhibited by newbies,
albeit different because it is the source editor. And it is great to
see so many constructive VE edits.
Thank you for acknowledging this!
But you're not going to get much love by claiming that it is now
stable and not causing broken diffs.
I did not make that specific claim, but it is possible that my tone was
more aggressive than I intended it to be. Just so there is no
confusion, VE/Parsoid can still cause dirty/broken diffs. My claim
is that at this time, the vast majority of ongoing edits do
not corrupt wikitext source, and where there are diffs, we have narrowed
them down to a couple of causes (where VE/Parsoid inserts nowiki
wrappers) which are being fixed.
In addition, VE can crash a Google Chrome tab, and it can cause
(unsaved changes) dataloss in most browser configurations.
Subbu.