2010/12/10 Giftpflanze <m.p.roppelt@web.de>

> Gahhh, this list. Nobody suggested just using Python's Twisted?[1] So
> much easier than trying to write your own script in Python using
> sockets and manual pongs and all that jazz.
I'm going to dig as deep as I can into http://krondo.com/?p=1209. Thanks for the suggestion. This will help me with the second step: now that I have my cleanly parsed IRC message, how can I use it for my tasks, sometimes simple, sometimes far from simple, while still listening for other messages? I could try a DIY (do-it-yourself) approach, but I guess this is not such an exotic problem, and it's much better to study a little first.

 
Here’s my RE that parses the RC IRC message in all aspects I know of:

The first line splits the server line into the actual IRC message and
the channel (i.e. wiki) it is coming from. The sending nick is ignored
since no one is allowed to talk at all and because it may change.

The second splits the message into its 6 constituent parts. That works
for every single line at the moment (sometimes a detail changes and we
are left with a mess), be it even a log entry and not an ordinary edit,
because the surrounding markup is present on every line. Sometimes the
message is too long for the IRC format (which allows for 512 bytes
including the final \r\n), so beware of cut-off lines.
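
A rough way to flag such lines in Python, as a sketch only: it assumes
the raw line is still available as bytes with its \r\n terminator, and
it treats any line that fills the whole 512-byte budget as suspect. The
names IRC_MAX_LINE and looks_truncated are made up for illustration.

```python
IRC_MAX_LINE = 512  # bytes, including the trailing "\r\n"


def looks_truncated(raw: bytes) -> bool:
    """Heuristic: a line that uses the full 512-byte budget was probably cut off."""
    return len(raw) >= IRC_MAX_LINE
```

A shorter line can still be damaged in other ways, so this only catches
the length-limit case.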

The REs are in the re_syntax(n) Tcl-style format (since this is taken
from my MediaWiki Tcl Library [~gifti/bot/irc.tcl]) but can easily be
adapted to other languages, I assume. I use \003 and \002 instead of
the literal control characters for better readability and portability.
Note that the color codes sometimes have leading zeros and sometimes
do not.

regexp {:[^ ]+ PRIVMSG #([^ ]+) :(.*?)} $line -> channel message

regexp {\00314\[\[\00307(.*)\00314\]\]\0034 (.*)\00310 \00302(.*)\003 \0035\*\003 \00303(.*)\003 \0035\*\003 \(*\002*\+*([^)]*)\002*\)* \00310(.*?)\003*} $message -> title action url user bytes comment
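
For anyone following along in Python, here is a tentative adaptation of
the two REs above. It is a sketch, not a tested port: the names LINE_RE,
RC_RE and parse_rc_line are invented here, and it deviates from the Tcl
version in three places. The patterns are anchored with $ because a
trailing "(.*?)" would match the empty string in Python; "0?" is added
so that color codes with or without a leading zero are accepted; and
the bytes group excludes \x02 so the closing bold marker is not
captured.

```python
import re

# Step 1: split the raw server line into channel (wiki) and message.
LINE_RE = re.compile(r'^:[^ ]+ PRIVMSG #([^ ]+) :(.*)$')

# Step 2: split the message into its six parts.
# \x03 starts a color code, \x02 toggles bold.
# "0?" (an addition in this sketch) tolerates a missing leading zero.
RC_RE = re.compile(
    r'\x0314\[\[\x030?7(.*)\x0314\]\]\x030?4 (.*)\x0310 \x030?2(.*)\x03 '
    r'\x030?5\*\x03 \x030?3(.*)\x03 \x030?5\*\x03 '
    r'\(*\x02*\+*([^)\x02]*)\x02*\)* \x0310(.*?)\x03*$'
)


def parse_rc_line(line):
    """Return a dict with the parsed fields, or None if the line does not match."""
    m = LINE_RE.match(line.rstrip('\r\n'))
    if m is None:
        return None
    channel, message = m.groups()
    m = RC_RE.match(message)
    if m is None:
        # Possibly a cut-off line, or the feed format has changed.
        return None
    title, action, url, user, nbytes, comment = m.groups()
    return {
        'channel': channel, 'title': title, 'action': action,
        'url': url, 'user': user, 'bytes': nbytes, 'comment': comment,
    }


# Made-up sample line in the shape the REs expect (not captured from the feed).
sample = (
    ':rc!~rc@example.invalid PRIVMSG #en.wikipedia :'
    '\x0314[[\x0307Main Page\x0314]]\x034 M\x0310 '
    '\x0302http://example.invalid/w/index.php?diff=2&oldid=1\x03 '
    '\x035*\x03 \x0303Example\x03 \x035*\x03 '
    '(\x02+1234\x02) \x0310fix typo\x03\r\n'
)
print(parse_rc_line(sample))
```

Returning None on a failed match leaves it to the caller to decide
whether to log or drop lines that were cut off or that no longer fit
the expected markup.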

VERY interesting, thank you!

Alex