Bugs item #1804008, was opened at 2007-09-28 09:35
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1804008...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Francesco Cosoleto (cosoleto)
Assigned to: Francesco Cosoleto (cosoleto)
Summary: showDiff() highlighting limitation due to difflib design
Initial Comment:
showDiff() can fail to highlight a char-by-char difference because Python's difflib does not seem to fully support char-by-char comparison.
Please see in the Python tracker:
* issue #1528074: "difflib.SequenceMatcher.find_longest_match() wrong result" (http://bugs.python.org/issue1528074)
* issue #1678345: "A fix for the bug #1528074 [warning: quite slow]" (http://bugs.python.org/issue1678345)
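
A minimal sketch of one documented difflib behaviour that can produce this kind of symptom: when the second sequence has 200 or more items, SequenceMatcher treats items that occur in more than about 1% of it as "popular" and ignores them when matching. Whether this is the exact cause inside showDiff() is an assumption; the strings below are made up for illustration, and the autojunk parameter used to switch the heuristic off only exists in Python 2.7.1+ and 3.2+.

```python
import difflib

# Made-up inputs: b is 200 characters long, so the "popular item"
# heuristic applies, and 'b' occurs far more often than 1% of the time.
a = "x" + "b" * 200
b = "b" * 200

# Default behaviour: the heuristic is enabled.
default_sm = difflib.SequenceMatcher(None, a, b)
default_match = default_sm.find_longest_match(0, len(a), 0, len(b))

# Heuristic disabled (parameter available in Python 2.7.1+ / 3.2+).
strict_sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
strict_match = strict_sm.find_longest_match(0, len(a), 0, len(b))

print("default:", default_match)
print("autojunk=False:", strict_match)
```

With the heuristic disabled, the full run of 200 'b' characters is reported as the longest match; with the default settings the reported match is shorter, which is the sort of result that would leave a character-level difference unhighlighted.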
----------------------------------------------------------------------
Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-04-11 11:31
Message: Actually, I'd very much like to see better diff support for pywikipedia. I don't know why I missed that bug =)
I see in those bugs several comments about complexity changes, saying that a patch could change complexity from O(n*m) to O(n+m), which certainly looks interesting. If char-by-char comparison provides better diffs, at a lower cost, what exactly is the reason for not supporting it in Python? :s
Two things to look at during implementation:
* Would it provide interesting diffs for all cases? (if one case is improved while other matches get worse, it's not so interesting anymore)
* Performance changes for big diffs.
Good luck =)
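
The two checks above (diff quality and cost) could be measured with something like the following rough sketch. It only compares stock difflib with and without its autojunk heuristic, not the modified difflib discussed in this thread; the helper name measure() and the sample text are hypothetical, and autojunk requires Python 2.7.1+ or 3.2+.

```python
import difflib
import time


def measure(a, b, autojunk):
    """Return (similarity ratio, elapsed seconds) for one comparison."""
    start = time.perf_counter()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=autojunk)
    ratio = matcher.ratio()
    return ratio, time.perf_counter() - start


# Hypothetical sample: repetitive text with a single small change.
base = "The quick brown fox jumps over the lazy dog. " * 20
changed = base.replace("lazy", "sleepy", 1)

results = {flag: measure(base, changed, flag) for flag in (True, False)}
for flag, (ratio, seconds) in results.items():
    print("autojunk=%s ratio=%.3f time=%.4fs" % (flag, ratio, seconds))
```

Running the same harness over real page revisions of different sizes would show whether better matches in one case come at the price of worse matches or longer run times in another.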
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2009-04-11 11:18
Message: Assigned to myself before somebody steals this issue from me. I am going to add a modified difflib version, unless the missing feature has been fixed in recent Python builds or, of course, anyone objects. I am not sure about a config option to enable or disable line-by-line/char-by-char comparison.
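
A rough sketch of what such a switch could look like. This is not pywikipedia's showDiff() code: the function name highlight_changes(), the char_level flag, and the [-...-] / {+...+} markers are all hypothetical, and it uses stock difflib with autojunk=False (Python 2.7.1+ / 3.2+) instead of a modified difflib.

```python
import difflib


def highlight_changes(old, new, char_level=True):
    """Return (old_marked, new_marked) with changed spans wrapped in markers.

    Hypothetical helper: removed text is wrapped in [-...-], added text
    in {+...+}. With char_level=False the whole line is marked instead.
    """
    if not char_level:
        if old == new:
            return old, new
        return "[-" + old + "-]", "{+" + new + "+}"

    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    old_out, new_out = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            old_out.append(old[i1:i2])
            new_out.append(new[j1:j2])
            continue
        # 'replace', 'delete' and 'insert' all reduce to marking
        # whatever non-empty slice exists on each side.
        if i1 != i2:
            old_out.append("[-" + old[i1:i2] + "-]")
        if j1 != j2:
            new_out.append("{+" + new[j1:j2] + "+}")
    return "".join(old_out), "".join(new_out)


print(highlight_changes("color", "colour"))
print(highlight_changes("color", "colour", char_level=False))
```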
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-25 22:38
Message: Logged In: NO
Guess this is an example http://bildr.no/view/146822
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2007-09-28 09:38
Message: Logged In: YES
user_id=181280
Originator: YES
File Added: difflib_test.py
----------------------------------------------------------------------
pywikipedia-l@lists.wikimedia.org