Revision: 4995
Author: wikipedian
Date: 2008-02-09 23:08:43 +0000 (Sat, 09 Feb 2008)
Log Message:
-----------
Fixed a bug that led to strange messages such as:
BUG>> title Page (Arizona) ([[pt:Page (Arizona)]]) not found in list Expected one of: [[pt:Page (Arizona)]]
This is caused by a stray left-to-right mark (U+200E) in http://de.wikipedia.org/w/index.php?title=Page_%28Arizona%29&action=edit... and other pages, sitting between pt: and Page. Left-to-right and right-to-left marks (U+200E, U+200F) are now removed automatically in the Page constructor.
Modified Paths:
--------------
    trunk/pywikipedia/pagegenerators.py
    trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/pagegenerators.py
===================================================================
--- trunk/pywikipedia/pagegenerators.py	2008-02-09 22:24:26 UTC (rev 4994)
+++ trunk/pywikipedia/pagegenerators.py	2008-02-09 23:08:43 UTC (rev 4995)
@@ -529,9 +529,11 @@
             if index is None:
                 raise ValueError(u'Unknown namespace: %s' % ns)
             namespaces[i] = index
+    rTMP = re.compile(r'\d')
     for page in generator:
-        if page.namespace() in namespaces:
-            yield page
+        if rTMP.search(page.title()) == None:
+            if page.namespace() in namespaces:
+                yield page
 
 def RedirectFilterPageGenerator(generator):
     """
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py	2008-02-09 22:24:26 UTC (rev 4994)
+++ trunk/pywikipedia/wikipedia.py	2008-02-09 23:08:43 UTC (rev 4995)
@@ -325,6 +325,8 @@
             # Replace underscores by spaces, also multiple spaces and underscores with a single space
             # Strip spaces at both ends
             t = re.sub('[ _]+', ' ', t).strip()
+            # Remove left-to-right and right-to-left markers.
+            t = re.sub(u'\u200e|\u200f', '', t)
             # leading colon implies main namespace instead of the default
             if t.startswith(':'):
                 t = t[1:]
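
The title clean-up in the wikipedia.py change can be tried in isolation. The following is only a minimal sketch, not the actual Page constructor: normalize_title and the sample strings are hypothetical names chosen for illustration, and the code is written so that it also runs on Python 3. It shows why an invisible U+200E makes two visually identical titles compare unequal, and how removing the marks restores the match.

```python
import re

# Hypothetical helper mirroring the two re.sub() steps from the diff;
# it is an illustration, not part of pywikipedia.
_DIRECTION_MARKS = re.compile(u'[\u200e\u200f]')  # LRM and RLM


def normalize_title(title):
    """Collapse spaces/underscores, strip both ends, drop LRM/RLM marks."""
    t = re.sub('[ _]+', ' ', title).strip()
    return _DIRECTION_MARKS.sub('', t)


# Assumed sample input: an interwiki link with a left-to-right mark
# hidden between "pt:" and "Page".
raw = u'pt:\u200ePage (Arizona)'
expected = u'pt:Page (Arizona)'

# The two strings look the same when displayed but differ by one
# invisible character, so a plain comparison fails.
titles_match_before = (raw == expected)

# After normalization the comparison succeeds.
titles_match_after = (normalize_title(raw) == expected)
```

The character class used here is equivalent to the alternation u'\u200e|\u200f' in the diff; either form removes every occurrence of both marks.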