Thought I'd point out a couple of useful things I've come across when doing
regex work (in Python, but also in other languages):
1: The re.VERBOSE flag. Lets you spread a regular expression across a
multiline string and add comments to it. Whitespace in the pattern is then
ignored, so any literal whitespace has to be escaped or matched with \s.
Makes it a lot easier to understand what you were thinking when you come
back to your code two months later to change it.
2: Using functions instead of strings as the replacement in sub(). The
function is called with the match object for each match and returns the
replacement text. If you're looking to do a fair amount of conditional logic
in your replacement, it is often easier to have a function do it than to
attempt it all within the regex.
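
To make both points concrete, here is a rough sketch that combines them. The
date pattern and the names DATE_RE, MONTHS, spell_out and rewrite_dates are
made up for illustration and have nothing to do with the pywikipedia code;
only re.compile(), re.VERBOSE and sub() with a callable are the real thing.

```python
import re

# Hypothetical pattern for illustration: ISO-style dates such as 2011-06-28.
# With re.VERBOSE, whitespace and "#" comments inside the pattern are ignored.
DATE_RE = re.compile(r"""
    (?P<year>\d{4})    # four-digit year
    -                  # literal separator
    (?P<month>\d{2})   # two-digit month
    -                  # literal separator
    (?P<day>\d{2})     # two-digit day
""", re.VERBOSE)

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def spell_out(match):
    """Replacement callback: gets a match object, returns the new text."""
    month = int(match.group("month"))
    if not 1 <= month <= 12:
        # Conditional logic that would be awkward in a replacement string:
        # leave anything that is not a plausible date untouched.
        return match.group(0)
    return "%s %d, %s" % (MONTHS[month - 1], int(match.group("day")),
                          match.group("year"))


def rewrite_dates(text):
    # sub() calls spell_out once for every match found in text.
    return DATE_RE.sub(spell_out, text)
```

Calling rewrite_dates("Sent on 2011-06-28.") should give "Sent on June 28,
2011.", while a string like "2011-13-01" is left alone because the callback
rejects month 13.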
My $.02.
Cheers,
Morten
On Tue, Jun 28, 2011 at 7:23 AM, Bináris <wikiposta(a)gmail.com> wrote:
OK, then I'll make separate lines. The only issue is that any
enhancement/correction will be more complicated this way (that is another
reason to put as many features in one line as possible).
2011/6/28 Marcin Cieslak <saper(a)saper.info>
Given the speed of fetching/storing pages I don't think that speed of the
regular expression makes any difference. Running two compiled REs
one after the other on the page text should be very fast.
--
Bináris
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l