2008/8/18 Roan Kattouw roan.kattouw@home.nl:
Guy Van den Broeck wrote:
Hi,
I think the HTML diff page I've been developing for the Google Summer of Code is ready to be tested as an experimental feature. As of r39564, you can enable it by setting $wgEnableHtmlDiff to true. What you'll see is a rendered version of the diff page with indications of where words were added or removed. Image edits are supported too. Words whose style changed are underlined, and you get an explanation of what happened (English only, for now).
The interface is pretty basic and needs work. I'm not very good with cross-browser stuff though. I can provide metadata in the HTML such as descriptions, IDs, pointers to the previous and next change, etc. Usability can be enhanced by adding links that take you to the first or last change on the page, tooltips that open when clicking a change, or keyboard shortcuts that scroll through the changes. Help is appreciated in this department.
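For anyone trying this out, a minimal sketch of the setting, assuming it goes in LocalSettings.php like other $wg configuration variables (the file is not named above):

```php
// LocalSettings.php (assumed location; requires a checkout at r39564 or later)
$wgEnableHtmlDiff = true;
```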
I spent a lot of time optimizing the code (include/HTMLDiff.php) for speed. That makes the code less readable, but performance is an issue. PHP is not my native tongue and the code would probably run faster if an expert took a look at it. I think the performance is pretty decent as it is (what do you expect from code that needs to parse 2 pages, diff every single word and keep everything in memory?). The algorithm will probably choke on big pages (set your available memory high!).
I cleaned up the code a bit in r39585. I rewrote two loops, so that may influence performance (haven't done any tests or benchmarks). In the optimization department I can't really help you with more than these generic tips:
- Put wfProfileIn() and wfProfileOut() calls all over the place and do some profiling to see which functions are bottlenecks
My experience is that wfProfile adds too much overhead for the diff code. There are just too many nested loops and the function call is pretty expensive. I use the Xdebug profiler. I assume it is at least as accurate as wfProfile.
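The profiling tip can be sketched roughly as follows. This is only an illustration: countChangedWords() is a hypothetical helper, not code from HTMLDiff.php, and the stand-in definitions of wfProfileIn()/wfProfileOut() exist only so the snippet runs outside MediaWiki. The profiling calls wrap the function as a whole rather than sitting inside the loop, since a call per iteration is where the overhead mentioned above would pile up.

```php
<?php
// Stand-ins so the sketch runs outside MediaWiki. Inside MediaWiki the real
// wfProfileIn()/wfProfileOut() already exist and this block is skipped.
if ( !function_exists( 'wfProfileIn' ) ) {
    $GLOBALS['profileLog'] = array();
    function wfProfileIn( $name ) {
        $GLOBALS['profileLog'][] = array( 'in', $name, microtime( true ) );
    }
    function wfProfileOut( $name ) {
        $GLOBALS['profileLog'][] = array( 'out', $name, microtime( true ) );
    }
}

// Hypothetical helper: counts positions at which two word lists differ.
function countChangedWords( array $old, array $new ) {
    wfProfileIn( __METHOD__ );   // one call per function entry, not per word
    $changed = 0;
    $n = max( count( $old ), count( $new ) );
    for ( $i = 0; $i < $n; $i++ ) {
        $a = isset( $old[$i] ) ? $old[$i] : null;
        $b = isset( $new[$i] ) ? $new[$i] : null;
        if ( $a !== $b ) {
            $changed++;
        }
    }
    wfProfileOut( __METHOD__ );  // must be reached on every return path
    return $changed;
}

$changed = countChangedWords( array( 'a', 'b' ), array( 'a', 'c', 'd' ) );
```

With Xdebug, no calls in the code are needed at all; the profiler is switched on in the PHP configuration instead.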
- If you're foreach()ing large arrays somewhere, try to use references: foreach($arr as $key => &$value) instead of foreach($arr as $key => $value). The latter makes a copy of $arr whereas the former doesn't. The former also allows you to change the array's elements through $value.
It doesn't make a significant difference here, but I added it anyway.
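For reference, a small sketch of the two foreach forms discussed above, using a made-up $words array. It only demonstrates the semantics (by-reference iteration can modify the array, by-value iteration cannot). Whether the by-value form really copies the whole array depends on the PHP version and its copy-on-write behaviour, so no performance claim is implied.

```php
<?php
$words = array( 'first' => 'foo', 'second' => 'bar' );

// By reference: $value aliases the array element, so the assignment
// changes $words itself.
foreach ( $words as $key => &$value ) {
    $value = strtoupper( $value );
}
// Break the reference. Otherwise $value keeps pointing at the last element
// and a later assignment to $value would silently overwrite it.
unset( $value );

// By value: $value is a separate copy of each element, so $words is untouched.
foreach ( $words as $key => $value ) {
    $value = strtolower( $value );
}
```

The unset() after the by-reference loop is easy to forget and is the usual source of bugs with this pattern.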
I'll start experimenting with HTMLDiff on my wiki now, input will follow.
Great! Is your wiki publicly available? I don't have a public test server of my own.
Roan Kattouw (Catrope)
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l