On Friday, April 15, 2016, MZMcBride <z(a)mzmcbride.com> wrote:
Max Semenik wrote:
Right now, MediaWiki has 2 pure-PHP engines to produce diffs (there's also
a native PHP extension wikidiff2, but we're not discussing it right now):
* DairikiDiff is what everybody uses, and
* Wikidiff3, an alternative implementation by Guy Van den Broeck that was
around for 8 years but required a configuration change
While less battle-tested, Wikidiff3 offers vastly improved performance on
heavy diffs compared to DairikiDiff. The price, however, is that it makes
certain shortcuts if the diff is too complex. I ran through 100K diffs
from English Wikipedia, and 6% of diffs were different. Lots of changes
were seemingly insignificant but I need your help with determining if
it's really so.
Is there a related Phabricator Maniphest task about this? I'm not sure I
understand the motivation for making a switch. I would think that heavy
diffs are a very small portion of traffic.
I think optimizing the worst-case performance makes sense, especially if we
don't really lose anything in doing so.
To clarify, this is just for third parties, right? WMF uses wikidiff2.