Nicolas Dumazet wrote:
> Trust me, learning how to use Python regexps is worth it :) (I learned by reading this page: http://www.amk.ca/python/howto/regex/ )
I'm an experienced Perl programmer. I know regexps, even though I'm a beginner in Python.
> In your case, replace.py will do the trick, with something like: "({{cite book[^}]*isbn\s*=\s*)(\d*)" "\1ISBN \2" for the first
Sorry, this only works in some cases. Consider a user who thinks the author name should appear in small caps:
{{cite book| author={{sc|Karl Marx}} | isbn=123 }}
Now your [^}]* will fail, because there is a (2nd level) template call between "{{cite book" and "isbn=". Even if it isn't impossible, it *is* a mess to get this right in regexps. That would be fine if it were done *once* in the Python code, but requiring the bot operator to get it right on the replace.py command line for every call is not realistic.
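To make the failure concrete, here is a minimal sketch in plain Python. It first runs the suggested pattern (braces escaped for clarity) on a flat and on a nested template call, then shows one way the brace matching could be done once in code. The helpers find_templates and prefix_isbn are hypothetical names invented for this illustration; they are not part of replace.py or of any pywikipedia module, and the sketch ignores triple-brace parameters, comments and nowiki sections.

```python
import re

# The pattern from the suggestion above, with the braces escaped.
PATTERN = re.compile(r"(\{\{cite book[^}]*isbn\s*=\s*)(\d*)")

simple = "{{cite book| author=Karl Marx | isbn=123 }}"
nested = "{{cite book| author={{sc|Karl Marx}} | isbn=123 }}"

simple_result = PATTERN.sub(r"\1ISBN \2", simple)   # rewritten
nested_result = PATTERN.sub(r"\1ISBN \2", nested)   # left unchanged: [^}]* stops at the inner "}}"


def find_templates(text, name):
    """Yield (start, end) spans of {{name ...}} calls, counting nested braces.

    Hypothetical helper for illustration only.
    """
    opener = "{{" + name
    pos = text.find(opener)
    while pos != -1:
        depth = 0
        i = pos
        end = None
        while i < len(text) - 1:
            pair = text[i:i + 2]
            if pair == "{{":
                depth += 1
                i += 2
            elif pair == "}}":
                depth -= 1
                i += 2
                if depth == 0:
                    end = i
                    break
            else:
                i += 1
        if end is None:
            return  # unbalanced braces: give up rather than guess
        yield pos, end
        pos = text.find(opener, end)


# Matches an isbn parameter anywhere inside the template span (an assumption:
# a nested template with its own isbn parameter would be rewritten too).
ISBN_PARAM = re.compile(r"(\|\s*isbn\s*=\s*)(\d+)")


def prefix_isbn(text):
    """Insert 'ISBN ' before the isbn value inside every {{cite book}} call."""
    out = []
    last = 0
    for start, end in find_templates(text, "cite book"):
        out.append(text[last:start])
        out.append(ISBN_PARAM.sub(r"\1ISBN \2", text[start:end]))
        last = end
    out.append(text[last:])
    return "".join(out)
```

With this in place, prefix_isbn(nested) handles the small-caps example, because the end of the template is found by counting braces rather than by a character class. The bot operator would then only name the template and the parameter, not write the matching logic.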
Now I hear talk of a pywikiparser, and that sounds promising. Will that take care of template call parsing? Where can I find it, to try it out?
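In addition to this, I believe your regexp may fail where there are linebreaks between "{{cite book" and "isbn=", depending on how replace.py feeds the page text to the regexp. If that turns out to be a problem, it could be fixed by adding a command line option to replace.py that activates one of Python's regexp flags, e.g. re.compile(..., re.DOTALL), akin to the existing option for re.IGNORECASE. (re.MULTILINE only changes how ^ and $ match.)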
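The linebreak question can be checked in plain Python, independent of replace.py. The sketch below assumes nothing about how replace.py actually reads pages. It only compares matching against the whole text with matching line by line, and shows which flag affects what: the negated class [^}] and \s already match newlines, re.MULTILINE changes only ^ and $, and re.DOTALL is the flag that lets "." cross a linebreak.

```python
import re

PATTERN = r"(\{\{cite book[^}]*isbn\s*=\s*)(\d*)"
REPLACEMENT = r"\1ISBN \2"

multiline_call = "{{cite book\n| author=Karl Marx\n| isbn=123\n}}"

# Whole text at once: [^}]* happily crosses the linebreaks, no flag needed.
whole = re.sub(PATTERN, REPLACEMENT, multiline_call)

# Line by line: no single line contains both "{{cite book" and "isbn=".
per_line = "\n".join(
    re.sub(PATTERN, REPLACEMENT, line) for line in multiline_call.split("\n")
)

# re.MULTILINE makes no difference here, since the pattern has no ^ or $.
with_multiline = re.sub(PATTERN, REPLACEMENT, multiline_call, flags=re.MULTILINE)

# A pattern that relies on "." does need re.DOTALL to span lines.
dot_plain = re.search(r"cite book.*isbn", multiline_call)
dot_dotall = re.search(r"cite book.*isbn", multiline_call, flags=re.DOTALL)
```

So whether the original pattern breaks on multi-line template calls comes down to whether the text is matched as one string or per line. A flag option would matter mainly for patterns that use "." or line anchors.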