[Pywikipedia-l] problem with fixes.py

Francesco Cosoleto cosoleto at gmail.com
Fri Jun 19 09:01:34 UTC 2009


Hannes Röst wrote:
> Hello
> 
> I am writing for the first time and I don't quite know where the
> appropriate place is to write this. I am working on the German

Originally this mailing list was named "pywikipediabot-users"; nowadays 
it looks more like a development mailing list.

> wikipedia and I ran into some problems using fixes.py, specifically I
> had this edit: http://de.wikipedia.org/w/index.php?title=Deutsches_Reich_1933_bis_1945&diff=prev&oldid=61255346
> 
> the problem is here:
> (r'\bdeutsche(r|n|) Reich\b', r'Deutsche\1 Reich'),
> 
> It seems to be the case that \b does not work with the German eszett,
> whereas \< does work in my case. Should this be changed in all cases
> where \b is used? Do you have other suggestions?

I am surprised to see that. My guess is that the German eszett is not 
counted as a word character here, so \b sees a word boundary right after 
it. I am not sure it is worth a bug report to Python; other software 
(like grep) does not handle this regexp either.

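A possible workaround might be this, where the negative lookbehind 
rejects a match directly after an eszett (the group is kept so that \1 
in the replacement still works):

ur'(?<!\xdf)\bdeutsche(r|n|) Reich\b'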
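
To illustrate, here is a minimal sketch for a current Python 3. It 
assumes the cause is that the eszett is not treated as a word character 
when the pattern is compiled without Unicode awareness; re.ASCII is used 
only to imitate that behaviour. The sample text and the helper name 
apply_fix are made up for this example and are not part of fixes.py.

```python
import re

REPLACEMENT = r'Deutsche\1 Reich'

# Hypothetical sample text: one compound word with eszett, one plain match.
text = "das großdeutsche Reich und das deutsche Reich"

# re.ASCII restricts \w (and therefore \b) to [a-zA-Z0-9_], so "ß" counts
# as a non-word character and \b matches between "ß" and "d".
ascii_pattern = re.compile(r'\bdeutsche(r|n|) Reich\b', re.ASCII)

# Same ASCII-only boundaries, but the lookbehind refuses a match that
# directly follows "ß" (U+00DF).
workaround_pattern = re.compile(r'(?<!\xdf)\bdeutsche(r|n|) Reich\b', re.ASCII)

# Python 3 str patterns are Unicode-aware by default: "ß" is a word
# character, so there is no boundary inside "großdeutsche".
unicode_pattern = re.compile(r'\bdeutsche(r|n|) Reich\b')


def apply_fix(pattern, s):
    """Apply the capitalisation fix with the given compiled pattern."""
    return pattern.sub(REPLACEMENT, s)


if __name__ == "__main__":
    for name, pattern in [("ascii", ascii_pattern),
                          ("workaround", workaround_pattern),
                          ("unicode", unicode_pattern)]:
        print(name, "->", apply_fix(pattern, text))
```

If that assumption holds, compiling the fix with Unicode-aware matching 
would be an alternative to the lookbehind, since the lookbehind only 
covers the eszett and not other non-ASCII letters.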

-- 
Francesco Cosoleto

"So let no one turn back
toward the ships once he has heard the call,
but press forward, urging one another on,
in the hope that Olympian Zeus, hurler of the thunderbolt, may grant us
to repel the assault and drive the enemy back to the city". (Homer)



