[Pywikipedia-l] [ pywikipediabot-Bugs-1804008 ] showDiff() highlighting limitation due to difflib design
SourceForge.net
noreply at sourceforge.net
Sat Apr 11 09:31:49 UTC 2009
Bugs item #1804008, was opened at 2007-09-28 09:35
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1804008&group_id=93107
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Francesco Cosoleto (cosoleto)
Assigned to: Francesco Cosoleto (cosoleto)
Summary: showDiff() highlighting limitation due to difflib design
Initial Comment:
showDiff() can fail to highlight a char-by-char difference because Python's difflib does not seem to fully support char-by-char comparison.
Please see in Python tracker:
* issue #1528074: "difflib.SequenceMatcher.find_longest_match() wrong result" (http://bugs.python.org/issue1528074)
* issue #1678345: "A fix for the bug #1528074 [warning: quite slow]" (http://bugs.python.org/issue1678345)
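A minimal sketch of the limitation, assuming it comes from difflib's "popular element" heuristic: when the second sequence has at least 200 items, any item that makes up more than about 1% of it is ignored as a match anchor. The `autojunk` keyword used to switch the heuristic off only exists in later Python releases (2.7.1 and 3.2 onward), so it was not available when this bug was filed. The sample strings are made up for illustration.

```python
import difflib

# Two 401-character strings that differ only in their first character.
# Every character of the shared body occurs 100 times, so the heuristic
# classifies all of them as "popular".
body = "abcd" * 100
old = "x" + body
new = "y" + body

# Default behaviour (autojunk=True): popular characters cannot start a match.
default_ratio = difflib.SequenceMatcher(None, old, new).ratio()

# Heuristic disabled: every character is considered when matching.
full_ratio = difflib.SequenceMatcher(None, old, new, autojunk=False).ratio()

print("default:", default_ratio)
print("autojunk=False:", full_ratio)
```

With the heuristic disabled the two strings are reported as almost identical, while the default comparison scores them much lower. The cost is that matching may become noticeably slower on long, repetitive inputs.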
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-04-11 11:31
Message:
Actually, I'd very much like to see better diff support for pywikipedia. I
don't know why I missed that bug =)
I see in those bugs several comments about complexity changes, saying that
a patch could change complexity from O(n*m) to O(n+m), which certainly
looks interesting. If char-by-char comparison provides better diffs, at a
lower cost, what exactly is the reason for not supporting it in Python? :s
Two things to look at during implementation:
* Would it provide interesting diffs for all cases? (if one case is
improved while other matches get worse, it's not so interesting anymore)
* Performance changes for big diffs.
Good luck =)
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2009-04-11 11:18
Message:
Assigned to myself before somebody steals this issue from me. I am going to
add a modified difflib version, unless the missing feature is fixed in
recent Python builds or, of course, anyone objects. I am not
sure about a config option to enable or disable line-by-line/char-by-char
comparison.
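A rough sketch of how such a switch could look, using only the standard difflib. The function name `diff_ops` and the `char_by_char` flag are hypothetical and are not part of pywikipedia; `autojunk=False` again assumes Python 2.7.1/3.2 or later.

```python
import difflib

def diff_ops(old, new, char_by_char=False):
    """Return (tag, old_piece, new_piece) tuples for the changed regions.

    Hypothetical helper: compares lists of lines by default, or the raw
    strings character by character when char_by_char is True.
    """
    if char_by_char:
        a, b = old, new
    else:
        a, b = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    # Keep only the regions that differ (replace, delete, insert).
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

old_text = "foo bar\nbaz\n"
new_text = "foo bar\nbat\n"
line_ops = diff_ops(old_text, new_text)
char_ops = diff_ops(old_text, new_text, char_by_char=True)
print(line_ops)
print(char_ops)
```

In line mode the whole changed line is reported, while in character mode only the differing character is, which is the kind of highlighting showDiff() would need.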
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-25 22:38
Message:
Logged In: NO
Guess this is an example
http://bildr.no/view/146822
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2007-09-28 09:38
Message:
Logged In: YES
user_id=181280
Originator: YES
File Added: difflib_test.py
----------------------------------------------------------------------