On Sat, Mar 22, 2008 at 6:28 PM, Guy Van den Broeck <guyvdb(a)gmail.com> wrote:
> I want to get some feedback on a possible Summer of
> Code project proposal.
> For last year's GSoC I created an HTML diffing library for Daisy CMS. The
> algorithm has proven to work well and I'm thinking of porting it to
> MediaWiki.
> What the algorithm does is take the source of two pages and merge them to
> visualize the diff. The code I have already does something like this:
> http://users.pandora.be/guyvdb/wikipediadiff.jpg
> Is this a feasible project for Wikimedia? I'm personally not very impressed
> with the current "diff pages". I think a visual diff would bring that part
> of MediaWiki up to par with the rest of the software.

I agree that inline diffs would be nicer, instead of side-by-side.
An HTML-rendered diff instead of a wikitext diff is useful
to some extent, but it hides information. It seems like it would be
relatively difficult to convey the fact that templates or images were
changed, for instance, and things like comments (which must be
included in diffs for proper usability) would also be an issue. Some
mechanism would have to be devised to convey that such invisible
changes took place. Possibly you could have an option to do a
wikitext diff instead, but that doesn't seem ideal to me. Doing it
one way that works well for everyone would be best if possible.
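
To make the "invisible changes" point concrete, here is a rough sketch, not the Daisy algorithm and not anything MediaWiki ships. It uses Python's standard difflib to merge two revisions into one inline view, and it shows a changed comment as a visible marker where it would otherwise vanish from the rendered output. The names tokenize, inline_diff and the "[comment: ...]" marker format are made up for illustration, and templates and images would need a similar treatment that I have not attempted here.

```python
import difflib
import html
import re

# Split text into HTML comments, whitespace runs, word-like runs, and stray "<".
# Every character falls into exactly one token, so joining tokens restores the text.
TOKEN_RE = re.compile(r"<!--.*?-->|\s+|[^\s<]+|<", re.DOTALL)


def tokenize(text):
    return TOKEN_RE.findall(text)


def is_comment(token):
    return token.startswith("<!--") and token.endswith("-->")


def render_unchanged(token):
    # Unchanged comments stay invisible, as they would on a rendered page.
    return "" if is_comment(token) else html.escape(token)


def render_changed(token):
    # Changed comments become a visible marker (hypothetical format).
    if is_comment(token):
        return "[comment: " + html.escape(token[4:-3].strip()) + "]"
    return html.escape(token)


def wrap(tag, tokens):
    if not tokens:
        return ""
    body = "".join(render_changed(t) for t in tokens)
    return "<%s>%s</%s>" % (tag, body, tag)


def inline_diff(old, new):
    """Merge two revisions into one string with <del>/<ins> markup."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.append("".join(render_unchanged(t) for t in a[i1:i2]))
        else:
            # "replace", "delete" and "insert" all reduce to del + ins.
            out.append(wrap("del", a[i1:i2]))
            out.append(wrap("ins", b[j1:j2]))
    return "".join(out)


if __name__ == "__main__":
    print(inline_diff("Hello world", "Hello world<!-- note -->"))
```

A pure-Python matcher like this only illustrates the output format; it says nothing about whether the approach would be fast enough on large articles.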
As for performance, please note that Wikimedia uses a diff engine
written in C++. Judging from past experience, one written in PHP
would probably not be acceptable on Wikipedia (diffing used to eat a
huge amount of CPU). Scalability is also important, within reason:
[[George W. Bush]] is 128 KiB, for instance.