A question from an IRC/Wikipedia newbie.
I've been experimenting with processing pubmsg events in irc://irc.wikimedia.org/en.wikipedia (thanks for the channel, btw) and have been noticing some control characters that I wasn't expecting to see in the message content. I've attached a raw line from the channel, where you should be able to see a 0x03 byte (Ctrl-C) at position 60. There are several others scattered throughout the line, each followed by integers. Is this a character encoding of some kind that I need to decode, or some artifact of the IRC protocol that I need to handle?
Any advice/tips would be greatly appreciated!
//Ed
On 14 October 2010 06:19, Ed Summers ehs@pobox.com wrote:
A question from an IRC/Wikipedia newbie.
I've been experimenting with processing pubmsg events in irc://irc.wikimedia.org/en.wikipedia (thanks for the channel, btw) and have been noticing some control characters that I wasn't expecting to see in the message content. I've attached a raw line from the channel, where you should be able to see a 0x03 byte (Ctrl-C) at position 60. There are several others scattered throughout the line, each followed by integers. Is this a character encoding of some kind that I need to decode, or some artifact of the IRC protocol that I need to handle?
Those are color codes. See for example http://www.mirc.com/help/colors.html
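If you only want the plain text, a rough sketch of stripping the codes in Python might look like the following. It assumes the usual mIRC convention (a 0x03 byte, optionally followed by a one- or two-digit foreground color and a comma plus background color); the sample line and the strip_formatting name are made up for illustration and are not taken from the channel.

```python
import re

# 0x03, optionally followed by "<fg>" or "<fg>,<bg>" (1-2 digits each).
COLOR_RE = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")
# Other common formatting bytes: bold, reset, reverse, italic, underline.
FORMAT_RE = re.compile(r"[\x02\x0f\x16\x1d\x1f]")


def strip_formatting(text):
    """Remove mIRC-style color and formatting codes from a message."""
    return FORMAT_RE.sub("", COLOR_RE.sub("", text))


# Illustrative line only, shaped like a colored IRC message.
sample = "\x0314[[\x0307Example page\x0314]]\x034 M\x0310 \x0302http://example.org/diff\x03"
print(strip_formatting(sample))
```

One caveat: if colored text itself starts with a digit, a pattern like this cannot tell where the color number ends, so it may swallow that digit.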
-Niklas
On Thu, Oct 14, 2010 at 2:24 AM, Niklas Laxström niklas.laxstrom@gmail.com wrote:
Those are color codes. See for example http://www.mirc.com/help/colors.html
Ahh, thanks!
//Ed
Ed Summers wrote:
A question from an IRC/Wikipedia newbie.
I've been experimenting with processing pubmsg events in irc://irc.wikimedia.org/en.wikipedia (thanks for the channel, btw) and have been noticing some control characters that I wasn't expecting to see in the message content. I've attached a raw line from the channel, where you should be able to see a 0x03 byte (Ctrl-C) at position 60. There are several others scattered throughout the line, each followed by integers. Is this a character encoding of some kind that I need to decode, or some artifact of the IRC protocol that I need to handle?
Any advice/tips would be greatly appreciated!
//Ed
They are color codes. You could simply strip them, but it is better to parse them, since they are really handy for splitting the fields.
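As a rough illustration of the parsing idea, the sketch below splits a message on the color codes and keeps the color number with each piece of text, so fields can be picked out by color. The sample line, the helper names, and the color-to-field mapping (7 for the title, 2 for the URL, 3 for the user) are assumptions made for this example; check them against real lines from the channel before relying on them.

```python
import re

# One capture group (the foreground color), so re.split() returns
# alternating text and color items: text, color, text, color, ...
SEGMENT_RE = re.compile(r"\x03(?:(\d{1,2})(?:,\d{1,2})?)?")


def color_segments(text):
    """Return (color, text) pairs; color is None for uncolored text."""
    segments = []
    color = None
    for i, part in enumerate(SEGMENT_RE.split(text)):
        if i % 2 == 1:
            # Captured color number; None when 0x03 had no digits (reset).
            color = int(part) if part else None
        elif part.strip():
            segments.append((color, part.strip()))
    return segments


def first_with_color(segments, color):
    """Return the first text segment carrying the given color, or None."""
    return next((t for c, t in segments if c == color), None)


# Illustrative line only; the layout is an assumption, not a captured message.
sample = (
    "\x0314[[\x0307Example page\x0314]]\x034 M\x0310 "
    "\x0302http://example.org/diff\x03 \x035*\x03 "
    "\x0303ExampleUser\x03 \x035*\x03 (+42) \x0310fixed a typo\x03"
)

segs = color_segments(sample)
print(first_with_color(segs, 7))   # assumed: page title
print(first_with_color(segs, 2))   # assumed: diff URL
print(first_with_color(segs, 3))   # assumed: user name
```

Keeping the color alongside each segment avoids guessing field boundaries from punctuation, which matters when titles or edit summaries contain brackets or asterisks themselves.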
Thanks for the help, folks. If anyone is curious, here's a little Python script I wrote that prints out the (parsed) edit stream:
http://gist.github.com/628199
//Ed
On Thu, Oct 14, 2010 at 6:31 PM, Platonides Platonides@gmail.com wrote:
They are color codes. You could simply strip them, but it is better to parse them, since they are really handy for splitting the fields.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Ed Summers <ehs <at> pobox.com> writes:
Thanks for the help, folks. If anyone is curious, here's a little Python script I wrote that prints out the (parsed) edit stream:
http://gist.github.com/628199
A year or so ago, some of us at #wikipedia-hu wrote a bot script that can react to events on the live RC channel; maybe you will find it useful: http://pastebin.com/yvkdZchQ