https://bugzilla.wikimedia.org/show_bug.cgi?id=55329
Web browser: ---
Bug ID: 55329
Summary: showDiff() highlighting limitation due to difflib design
Product: Pywikibot
Version: unspecified
Hardware: All
OS: All
Status: ASSIGNED
Severity: normal
Priority: Unprioritized
Component: General
Assignee: Pywikipedia-bugs@lists.wikimedia.org
Reporter: legoktm.wikipedia@gmail.com
Classification: Unclassified
Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/509/
Reported by: cosoleto
Created on: 2007-09-28 07:35:32
Subject: showDiff() highlighting limitation due to difflib design
Assigned to: cosoleto
Original description:
showDiff() can fail to highlight a char-by-char difference because Python's difflib does not seem to fully support char-by-char comparison.
Please see the Python tracker:
* issue #1528074: "difflib.SequenceMatcher.find_longest_match() wrong result" (http://bugs.python.org/issue1528074)
* issue #1678345: "A fix for the bug #1528074 [warning: quite slow]" (http://bugs.python.org/issue1678345)
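For illustration, here is a minimal sketch of one way this limitation can show up in a char-by-char comparison. It assumes the cause is the "popular element" heuristic in difflib.SequenceMatcher, and that the Python in use accepts the `autojunk` parameter; neither point is stated in this report. The helper `changed_spans` and the sample strings are hypothetical and are not taken from showDiff().

```python
import difflib


def changed_spans(old, new, autojunk=True):
    """Return the non-equal opcodes of a character-level comparison.

    Hypothetical helper for illustration; not part of pywikibot.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=autojunk)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


# Hypothetical sample text: more than 200 characters built from only two
# symbols, so both symbols count as "popular" under the default heuristic
# and are left out of the matcher's index.
body = "ab" * 150
old = "X" + body
new = "Y" + body

print(changed_spans(old, new))                  # heuristic on (the default)
print(changed_spans(old, new, autojunk=False))  # heuristic off
```

With the heuristic on, the whole 301-character string is reported as a single replaced span, so a highlighter built on these opcodes would mark everything as changed. With `autojunk=False`, only the first character is reported, at the cost of a slower comparison on long inputs.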
--- Comment #1 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Logged In: YES user_id=181280 Originator: YES
File Added: difflib_test.py
--- Comment #2 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **priority**: 5 --> 6
--- Comment #3 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Logged In: NO
Guess this is an example http://bildr.no/view/146822
--- Comment #4 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Assigned to myself before somebody steals this issue from me. I am going to add a modified difflib version, unless the missing feature has been fixed in recent Python builds or, of course, anyone objects. I am not sure about a config option to enable or disable line-by-line/char-by-char comparison.
--- Comment #5 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **priority**: 6 --> 7 - **assigned_to**: nobody --> cosoleto
--- Comment #6 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Actually, I'd very much like to see better diff support for pywikipedia. I don't know why I missed that bug =)
I see in those bugs several comments about complexity changes, saying that a patch could change complexity from O(n*m) to O(n+m), which certainly looks interesting. If char-by-char comparison provides better diffs at a lower cost, what exactly is the reason for not supporting it in Python? :s
Two things to look at during implementation:
* Would it provide interesting diffs for all cases? (If one case is improved while other matches get worse, it's not so interesting anymore.)
* Performance changes for big diffs.
Good luck =)
--- Comment #7 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- I don't need luck because I am not going to do any big work, just a simple adaptation of already written code (with a loss of performance). If you are interested in working on this problem in a different way, you are welcome to (and not only in this open project). Anyway, it's nice to see you have analysed the situation a bit.
The changed version should be safe, without regression cases. I will make sure to document the performance loss.
--- Comment #8 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **Group**: --> confirmed
Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
See Also |        |https://sourceforge.net/p/pywikipediabot/bugs/509
Strainu crangasi2001@yahoo.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |crangasi2001@yahoo.com
--- Comment #9 from Strainu crangasi2001@yahoo.com --- This appears to have been fixed upstream, right?
--- Comment #10 from Andre Klapper aklapper@wikimedia.org --- Indeed, both Python issues linked in comment 0 (http://bugs.python.org) have been fixed.
xqt info@gno.de changed:
What   |Removed  |Added
----------------------------------------------------------------------------
Status |ASSIGNED |NEW
CC     |         |info@gno.de