Thanks for the help, folks. If anyone is curious, here's a little Python script I wrote that prints out the (parsed) edit stream:
//Ed
On Thu, Oct 14, 2010 at 6:31 PM, Platonides Platonides@gmail.com wrote:
Ed Summers wrote:
A question from an IRC/wikipedia newbie.
I've been experimenting with processing pubmsg events in irc://irc.wikimedia.org/en.wikipedia (thanks for the channel btw) and have been noticing some control characters that I wasn't expecting to see in the message content. I've attached a raw line from the channel, where you should be able to see a 0x03 byte (ctrl-c) at position 60. There are several others scattered throughout the line, each followed by integers. Is this a character encoding of some kind that I need to decode, or some artifact of the IRC protocol that I need to handle?
Any advice/tips would be greatly appreciated!
//Ed
They are color codes. You could simply strip them, but it is better to parse them, since they are really handy for splitting the fields.
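
For anyone finding this thread later, here is a minimal sketch of both approaches in Python. It is not the script mentioned at the top of the thread. It assumes the codes follow the common mIRC convention (a 0x03 byte, optionally followed by a one- or two-digit foreground color and an optional ",NN" background), and the SAMPLE line is made up for illustration, not a real capture from the channel. The names strip_codes and split_fields are hypothetical.

```python
import re

# Assumed mIRC-style color code: 0x03, then an optional 1-2 digit foreground
# and an optional ",NN" background. A bare 0x03 resets the color.
COLOR = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")

# Other common formatting control bytes: bold, reset, reverse, italic, underline.
OTHER = re.compile(r"[\x02\x0f\x16\x1d\x1f]")


def strip_codes(line):
    """Return the line with color and formatting codes removed."""
    return OTHER.sub("", COLOR.sub("", line))


def split_fields(line):
    """Split the line at each color code and drop empty or blank pieces."""
    parts = COLOR.split(OTHER.sub("", line))
    return [p.strip() for p in parts if p.strip()]


# Made-up sample line (an assumption, not real channel output).
SAMPLE = "\x0314[[\x0307Example\x0314]]\x03 \x0303SomeUser\x03 \x0310fix typo\x03"

if __name__ == "__main__":
    print(strip_codes(SAMPLE))
    print(split_fields(SAMPLE))
```

Splitting on the codes keeps the page title, user name and comment as separate items even when they contain spaces, which is hard to recover once the codes have been stripped. One caveat: a bare 0x03 that is immediately followed by digits in the message text would be read as a color number, so treat this as a starting point only.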
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l