Bugs item #2079760, was opened at 2008-08-28 05:30
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=207976…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Mikko Silvonen (silvonen)
Assigned to: Nobody/Anonymous (nobody)
Summary: Periods converted to percent signs in section links
Initial Comment:
Why did my interwiki.py edit
http://en.wikipedia.org/w/index.php?title=1st_Belorussian_Front&diff=23…
convert the link
[[de:Zentralfront#1._Wei.C3.9Frussische_Front]]
to
[[de:Zentralfront#1% Weißrussische Front]]?
The correct decoded link would be:
[[de:Zentralfront#1. Weißrussische Front]]
C:\svn\pywikipedia>python version.py
Pywikipedia [http] trunk/pywikipedia (r5854, Aug 27 2008, 21:32:58)
Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
Comment By: NicDumZ Nicolas Dumazet (nicdumz)
Date: 2008-08-28 08:06
Message:
Logged In: YES
user_id=1963242
Originator: NO
r5856 ( patch by Jeremy B. ) should fix this :)
----------------------------------------------------------------------
Comment By: Jeremy Baron (jeremybaron)
Date: 2008-08-28 06:35
Message:
Logged In: YES
user_id=1669658
Originator: NO
I don't know MediaWiki's anchor encoding very well, but I think this
fixes it. (Patch is inline below because I see no obvious way to attach it.
I know there is a way; maybe I don't have sufficient privileges.)
Rudimentary tests (before and after patch application):
In [7]: import wikipedia
Checked for running processes. 1 processes currently running, including
the current process.
In [8]: sectionlinktests = ('de:Zentralfront#1._Wei.C3.9Frussische_Front','a#.41.29'); sectionlinktester = lambda x: wikipedia.Page(wikipedia.getSite(),x).aslink()
In [9]: [(x,sectionlinktester(x)) for x in sectionlinktests]
Out[9]:
[('de:Zentralfront#1._Wei.C3.9Frussische_Front',
u'[[de:Zentralfront#1% Wei\xdfrussische Front]]'),
('a#.41.29', u'[[A#A)]]')]
In [10]: reload(wikipedia)
Checked for running processes. 2 processes currently running, including
the current process.
Out[10]: <module 'wikipedia' from '/Users/jeremy/sandbox/mediawiki/pywikipediabot/pywikipedia/wikipedia.py'>
In [11]: [(x,sectionlinktester(x)) for x in sectionlinktests]
Out[11]:
[('de:Zentralfront#1._Wei.C3.9Frussische_Front',
u'[[de:Zentralfront#1. Wei\xdfrussische Front]]'),
('a#.41.29', u'[[A#A)]]')]
patch:
Index: pywikipedia/wikipedia.py
===================================================================
--- pywikipedia/wikipedia.py (revision 5855)
+++ pywikipedia/wikipedia.py (working copy)
@@ -228,6 +228,7 @@
Rwatchlist = re.compile(r"<input tabindex='[\d]+' type='checkbox' "
                        r"name='wpWatchthis' checked='checked'")
Rlink = re.compile(r'\[\[(?P<title>[^\]\|\[]*)(\|[^\]]*)?\]\]')
+resectiondecode = re.compile(r"\.(?=[0-9a-f]{2})",re.I)
class Page(object):
@@ -526,7 +527,7 @@
"""
section = self._section
if section and decode:
- section = section.replace('.', '%')
+ section = resectiondecode.sub('%',section)
section = url2unicode(section, self._site)
if not underscore:
section = section.replace('_', ' ')
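As a standalone illustration of the decoding step this patch touches, the before/after behaviour can be sketched roughly as follows. This is not pywikipedia code: `naive_decode` and `decode_section` are hypothetical names, the sketch is written for Python 3 rather than the Python 2.5 used in this thread, it uses `urllib.parse.unquote` in place of pywikipedia's `url2unicode`, and it assumes the anchor bytes are UTF-8.

```python
import re
from urllib.parse import unquote

# A period counts as an escape only when two hex digits follow it.
# (Hypothetical module-level name, mirroring the idea in the patch.)
DOT_ESCAPE = re.compile(r"\.(?=[0-9a-f]{2})", re.I)


def naive_decode(anchor):
    """Pre-patch behaviour: every period becomes a percent sign."""
    decoded = unquote(anchor.replace(".", "%"), encoding="utf-8")
    return decoded.replace("_", " ")


def decode_section(anchor, underscore=False):
    """Post-patch idea: only '.XX' sequences are treated as escapes."""
    decoded = unquote(DOT_ESCAPE.sub("%", anchor), encoding="utf-8")
    if not underscore:
        # Underscores stand in for spaces in section anchors.
        decoded = decoded.replace("_", " ")
    return decoded


if __name__ == "__main__":
    sample = "1._Wei.C3.9Frussische_Front"
    print(naive_decode(sample))    # the period after "1" is wrongly turned into "%"
    print(decode_section(sample))  # the period after "1" is preserved
```

The lookahead keeps an ordinary period intact, but a literal period that happens to be followed by two hex digits in a heading would still be read as an escape, so the approach is a heuristic rather than an exact inverse of the encoding.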
BTW, SourceForge strips all kinds of things out of bugspam, not just
German characters :-/
----------------------------------------------------------------------
Comment By: Mikko Silvonen (silvonen)
Date: 2008-08-28 05:38
Message:
Logged In: YES
user_id=127947
Originator: YES
Ouch, the SourceForge email system removes the German sharp s character
from the messages. See this issue on the web for the correct links.
----------------------------------------------------------------------