[Pywikipedia-l] [ pywikipediabot-Bugs-1504707 ] Replace.py's -fix:HTML breaks articles' formatting

SourceForge.net noreply at sourceforge.net
Sun Jan 27 03:55:33 UTC 2008


Bugs item #1504707, was opened at 2006-06-12 13:40
Message generated for change (Comment added) made by tavernier
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1504707&group_id=93107

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replace.py's -fix:HTML breaks articles' formatting

Initial Comment:
On most articles, -fix:HTML simply works. However,
wikimarkup has a flaw: it is difficult to distinguish
''' used for bolding from ''' used as italics followed
by a single apostrophe (a frequently used construct).
As a result, when it converts italics and bolding,
-fix:HTML can cause unintentional bolding, a major problem.

See this diff:
http://en.wikipedia.org/w/index.php?title=Multiplication_table&diff=prev&oldid=57998433
Before:
http://en.wikipedia.org/w/index.php?title=Multiplication_table&oldid=57270331
After:
http://en.wikipedia.org/w/index.php?title=Multiplication_table&oldid=57998433

I think this could be fixed by scanning the region
defined by the previous and the next linebreak, and if
there are any other '''s, either using a <nowiki>
construct to protect the apostrophe from interpretation,
using HTML <i> tags, or simply not doing any replacements.
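
A minimal sketch of the last option (not doing the replacement when it
would be ambiguous), assuming the conversion works line by line. The names
fix_italics and _fix_line are hypothetical and are not part of replace.py;
this only illustrates the idea, not how -fix:HTML is implemented.

```python
import re

# Hypothetical helpers for illustration; not taken from replace.py.
ITALIC_TAG = re.compile(r"<i>(.*?)</i>", re.IGNORECASE)


def _fix_line(line):
    # If the line already contains apostrophe markup, leave it untouched.
    if "''" in line:
        return line

    def repl(match):
        inner = match.group(1)
        # Characters directly before and after the <i>...</i> span
        # (empty strings at the start or end of the line).
        before = line[match.start() - 1:match.start()]
        after = line[match.end():match.end() + 1]
        # Keep the HTML tag when converting would create a run of three
        # or more apostrophes, e.g. <i>Hamlet</i>'s.
        if (not inner or "'" in (before, after)
                or inner.startswith("'") or inner.endswith("'")):
            return match.group(0)
        return "''" + inner + "''"

    return ITALIC_TAG.sub(repl, line)


def fix_italics(text):
    # Treat each region between two linebreaks independently.
    return "\n".join(_fix_line(line) for line in text.split("\n"))
```

With this sketch, "a <i>b</i> c" is converted, while a line such as
"<i>Hamlet</i>'s plot" keeps its HTML tags.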

~maru

----------------------------------------------------------------------

Comment By: Tavernier (tavernier)
Date: 2008-01-27 04:55

Message:
Logged In: YES 
user_id=1705732
Originator: NO

It could be fixed by adding exceptions to the doReplacements method.

I suggest 'comment', 'math', 'nowiki', 'pre' and 'source'.

It would look like this:

    def doReplacements(self, original_text):
        """
        Returns the text which is generated by applying all replacements to
        the given text.
        """
        new_text = original_text
        # Regions in which no replacement should be made.
        exceptions = ['comment', 'math', 'nowiki', 'pre', 'source']
        if 'inside-tags' in self.exceptions:
            exceptions += self.exceptions['inside-tags']
        if 'inside' in self.exceptions:
            exceptions += self.exceptions['inside']
        for old, new in self.replacements:
            new_text = wikipedia.replaceExcept(new_text, old, new, exceptions,
                                               allowoverlap=self.allowoverlap)
        return new_text
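
To illustrate what such exceptions are meant to achieve, here is a
simplified, self-contained sketch that applies a regex replacement only
outside HTML comments and a list of protected tags. It is not the actual
wikipedia.replaceExcept implementation; the function name
replace_outside_tags and its parameters are assumptions made for this
example only.

```python
import re

# Tag names taken from the suggestion above; HTML comments are handled
# separately because they are not <tag>...</tag> pairs.
PROTECTED_TAGS = ["math", "nowiki", "pre", "source"]


def replace_outside_tags(text, old, new, tags=PROTECTED_TAGS):
    """Apply re.sub(old, new, ...) only to text outside protected regions."""
    parts = [r"<!--.*?-->"]
    parts += [r"<%s\b[^>]*>.*?</%s\s*>" % (t, t) for t in tags]
    protected = re.compile("|".join(parts), re.IGNORECASE | re.DOTALL)

    result = []
    pos = 0
    for match in protected.finditer(text):
        # Replace in the unprotected text before this region...
        result.append(re.sub(old, new, text[pos:match.start()]))
        # ...and copy the protected region through unchanged.
        result.append(match.group(0))
        pos = match.end()
    # Replace in whatever follows the last protected region.
    result.append(re.sub(old, new, text[pos:]))
    return "".join(result)
```

Note that this sketch handles each unprotected stretch separately, so a
match can never span a protected region; the real framework function may
behave differently in such cases.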

----------------------------------------------------------------------

Comment By: Rotem Liss (rotemliss)
Date: 2007-11-25 14:05

Message:
Logged In: YES 
user_id=1327030
Originator: NO

This is a "bug" in MediaWiki (or in the text) and isn't related to pre or
nowiki tags. The problem was that there was an invalid <i> tag, while the
five apostrophes already made the text italic. Changing it would cause the
same problems in a line that starts with a space and in a regular line.
About nowiki and pre, these are already scanned and the text inside them is
ignored (space before line is not scanned, though it's possible, but such
scan is not needed, as tags in such line are parsed, unlike pre or nowiki).
This is not a bug in the framework - I think it wasn't a bug when reported,
and it's definitely not a bug now. The problem is of the original text or
MediaWiki.

----------------------------------------------------------------------

Comment By: Russell Blau (russblau)
Date: 2007-11-21 16:12

Message:
Logged In: YES 
user_id=855050
Originator: NO

The specific bug identified on "Multiplication table" no longer exists. 
Is there still a problem?  If not, this bug can be closed.

----------------------------------------------------------------------

Comment By: siebrand (siebrand)
Date: 2007-04-26 21:29

Message:
Logged In: YES 
user_id=1107255
Originator: NO

Please let us know if this bug report is still applicable to the current
code. If no response is given, the bug report will be closed one month from
now. This message was added in an effort to reduce the number of open
issues on this project. Siebrand

----------------------------------------------------------------------
