[Toolserver-l] Two beginner questions

Giftpflanze m.p.roppelt at web.de
Fri Dec 10 06:12:07 UTC 2010


MZMcBride wrote:
> Alex Brollo wrote:
> > 2. The script bring(s) into life a python bot, who reads 
> > RecentChanges at 10 minutes intervals by a cron routine. Is perhaps 
> > more efficient a #irc bot listening it.wikisource #irc channel for 
> > recent changes in your opinion? Where can I find a good python 
> > script to read #irc channels?
> 
> Gahhh, this list. Nobody suggested just using Python's Twisted?[1] So 
> much easier than trying to write your own script in Python using 
> sockets and manual pongs and all that jazz.

Listening on IRC is not that dramatic, regardless of language. It can 
easily be done by hand.
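
As a rough illustration of how little a hand-written listener needs, 
here is a minimal Python sketch (Python because that is what the 
original question asks for). It is not code from anyone in this thread. 
The names reply_to and listen, the default port 6667 and the nick in 
the usage comment are assumptions made for the example; 
irc.wikimedia.org is the server mentioned in the quoted text below, and 
the channel name is guessed from the wiki in the question.

```python
import socket


def reply_to(line):
    """Return the answer a server line requires, or None if it needs none."""
    # The server regularly sends "PING :token" and expects "PONG :token" back.
    if line.startswith("PING"):
        return "PONG" + line[4:]
    return None


def listen(host, channel, nick, port=6667, handle=print):
    """Connect, join one channel and pass every other line to handle()."""
    with socket.create_connection((host, port)) as sock:

        def send(text):
            sock.sendall((text + "\r\n").encode("utf-8"))

        send("NICK " + nick)
        send("USER {0} 0 * :{0}".format(nick))
        buffer = b""
        while True:
            data = sock.recv(4096)
            if not data:
                break  # connection closed by the server
            buffer += data
            # Keep an incomplete trailing line in the buffer for the next read.
            *lines, buffer = buffer.split(b"\r\n")
            for raw in lines:
                line = raw.decode("utf-8", errors="replace")
                answer = reply_to(line)
                if answer is not None:
                    send(answer)
                    continue
                parts = line.split(" ")
                if len(parts) > 1 and parts[1] == "001":
                    # 001 is the welcome numeric: registration is done, so join.
                    send("JOIN " + channel)
                handle(line)


# Hypothetical usage (opens a network connection, so it is not run here):
# listen("irc.wikimedia.org", "#it.wikisource", "my-rc-reader")
```

Reconnecting after a dropped connection and handling a nick that is 
already taken are left out to keep the sketch short.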

> You're more than welcome to look around my home directory (check 
> /home/mzmcbride/scripts/irc/) for some IRC bots. The bot I 
> specifically use to relay irc.wikimedia.org to irc.freenode.net is on 
> another server, but I'd be happy to post the code for you if you'd 
> like. His name is snitch and he supports all Wikimedia wikis, multiple 
> channels, and stalks per-page, per-user, or per-wiki.

Interesting.

Here are my REs that parse the RC IRC message in every aspect I know of:

The first RE splits the server line into the channel (i.e. the wiki) it 
is coming from and the actual IRC message. The sending nick is ignored 
since no one is allowed to talk at all and because it may change.

The second RE splits the message into its 6 constituent parts. At the 
moment it works for every single line, whether it is an ordinary edit 
or a log entry, because the surrounding markup is present on every line 
(though sometimes a detail of the format changes and we are left with a 
mess). Sometimes the message is too long for the IRC format (which 
allows 512 bytes per line including the final \r\n), so beware of 
cut-off lines.

The REs are in Tcl's re_syntax(n) format (since they are taken from my 
MediaWiki Tcl Library [~gifti/bot/irc.tcl]) but can easily be adapted 
to other languages, I assume. I write \003 and \002 instead of the 
literal control characters for better readability and portability. Note 
that the color codes sometimes have leading zeros and sometimes do not.

regexp {:[^ ]+ PRIVMSG #([^ ]+) :(.*?)} $line -> channel message

regexp {\00314\[\[\00307(.*)\00314\]\]\0034 (.*)\00310 \00302(.*)\003 \0035\*\003 \00303(.*)\003 \0035\*\003 \(*\002*\+*([^)]*)\002*\)* \00310(.*?)\003*} $message -> title action url user bytes comment
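
For Python, the two expressions above might be ported roughly as 
follows. This is a sketch under assumptions, not a tested drop-in 
replacement: the names SERVER_LINE, RC_MESSAGE and parse_rc_line are 
made up for the example. It deviates from the Tcl version in three 
places: it uses fullmatch() because Python resolves non-greedy 
quantifiers differently from Tcl, it makes the leading zero of the 
single-digit color codes optional, and it keeps the bold marker \x02 
out of the bytes field.

```python
import re

# Server line -> channel (the wiki) and message; the sending nick is ignored.
SERVER_LINE = re.compile(r":[^ ]+ PRIVMSG #(?P<channel>[^ ]+) :(?P<message>.*)")

# Message -> title, action, url, user, bytes, comment.
# \x03 starts a color code, \x02 toggles bold; "0?" tolerates both forms
# of a single-digit color code (with and without leading zero).
RC_MESSAGE = re.compile(
    r"\x0314\[\[\x030?7(?P<title>.*)\x0314\]\]"
    r"\x030?4 (?P<action>.*)\x0310 "
    r"\x030?2(?P<url>.*)\x03 \x030?5\*\x03 "
    r"\x030?3(?P<user>.*)\x03 \x030?5\*\x03 "
    r"\(*\x02*\+*(?P<bytes>[^)\x02]*)\x02*\)* "
    r"\x0310(?P<comment>.*?)\x03*"
)


def parse_rc_line(line):
    """Return a dict of the parsed fields, or None if the line does not fit."""
    server = SERVER_LINE.fullmatch(line.rstrip("\r\n"))
    if server is None:
        return None
    fields = RC_MESSAGE.fullmatch(server.group("message"))
    if fields is None:
        # Not an RC message in the known format, e.g. cut off too early.
        return None
    result = fields.groupdict()
    result["channel"] = server.group("channel")
    return result


# Made-up sample line in the format described above (not a captured one).
sample = (
    ":rcbot!rc@example PRIVMSG #it.wikisource :"
    "\x0314[[\x0307Pagina principale\x0314]]\x034 M\x0310 "
    "\x0302http://it.wikisource.org/w/index.php?diff=2&oldid=1\x03 "
    "\x035*\x03 \x0303Alex\x03 \x035*\x03 (+12) \x0310typo\x03\r\n"
)
parsed = parse_rc_line(sample)
```

A line that is cut off before the comment starts does not match and 
yields None. A line cut off inside the comment still matches, just with 
a shortened comment, so a length check is needed to catch that case.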

Giftpflanze


