Thanks for the help, folks. If anyone is curious, here's a little
Python script I wrote that prints out the (parsed) edit stream:
http://gist.github.com/628199
//Ed
On Thu, Oct 14, 2010 at 6:31 PM, Platonides <Platonides(a)gmail.com> wrote:
Ed Summers wrote:
A question from an IRC/wikipedia newbie.
I've been experimenting with processing pubmsg events in
irc://irc.wikimedia.org/en.wikipedia (thanks for the channel btw) and
have been noticing some control characters that I wasn't expecting to
see in the message content. I've attached a raw line from the channel,
where you should be able to see an 0x03 byte (ctrl-c) at position 60.
There are several others scattered throughout the line followed by
integers. Is this a character encoding of some kind that I need to
decode, or some artifact of the IRC protocol that I need to handle?
Any advice/tips would be greatly appreciated!
//Ed
They are color codes. You could simply strip them, but it is better to
parse them, since they are really handy for splitting the fields.
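
For anyone following along, here is a minimal sketch of both approaches
(stripping and splitting). It assumes the codes follow the common mIRC
convention: a 0x03 byte, optionally followed by a one- or two-digit
foreground color and an optional ",background" number. The sample line
and the names COLOR_RE, strip_colors and split_fields are made up for
illustration; they are not taken from the channel or from the gist above.

```python
import re

# Assumed mIRC-style color code: 0x03, then an optional 1-2 digit
# foreground color, then an optional ",<1-2 digit background>".
COLOR_RE = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")


def strip_colors(text):
    """Return text with every color code removed."""
    return COLOR_RE.sub("", text)


def split_fields(text):
    """Split text at color codes, dropping empty or whitespace-only pieces."""
    return [piece.strip() for piece in COLOR_RE.split(text) if piece.strip()]


# Hypothetical line, loosely shaped like a recent-changes message.
sample = (
    "\x0314[[\x0307Example page\x0314]]\x034 M\x0310 "
    "\x0302http://example.org/diff\x03 \x035*\x03 "
    "\x0303ExampleUser\x03 \x035*\x03 (+12) \x0310fixed a typo\x03"
)

if __name__ == "__main__":
    print(strip_colors(sample))
    for field in split_fields(sample):
        print(repr(field))
```

One caveat with this pattern: if the colored text itself starts with a
digit, that digit can be swallowed as part of the color number, so a
real parser may want to rely on the two-digit form where the sender
uses it.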
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l