Thanks Roan & Brad! We'll get back on track with wmf.1 deployments today :D
-Chad
On Wed, May 11, 2016 at 11:08 PM, Roan Kattouw <rkattouw@wikimedia.org> wrote:
TLDR: the bug is fixed and the errors have stopped.
I started working around this train hold by backporting the entire Echo extension from wmf1 to wmf23, assuming that the bug would be in MW core and updating Echo wouldn't affect it. Right after I deployed that, these errors started being thrown by wmf23 too.
It turned out that one of the Echo changes I backported stores the integer -1 in redis under some circumstances. RedisBagOStuff treats integers specially, in order to make incr() work: it stores them as plain numbers instead of PHP-serialized data. But when retrieving this value, the code didn't recognize -1 as a plain number because it didn't consist solely of digits ('-' is not a digit), so it thought it was PHP-serialized data and passed it to unserialize(), which caused the error. Apparently no one had ever tried to store a negative integer in redis (!) until my Echo change exposed the bug.
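To make the failure mode concrete, here is a minimal sketch in Python. It is not the actual RedisBagOStuff code (that is PHP); the function names and the "J:"-prefixed JSON format are hypothetical stand-ins for PHP's serialize()/unserialize(), chosen only so the example is self-contained. It shows how a digits-only check misclassifies "-1", and how allowing an optional leading minus sign avoids that.

```python
import json
import re

PREFIX = "J:"  # hypothetical marker for "serialized" values in this sketch


def serialize(value):
    # Integers are stored as plain numbers so a server-side increment can work.
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    # Everything else goes through the stand-in serializer.
    return PREFIX + json.dumps(value)


def _decode_serialized(blob):
    # Stand-in for unserialize(): rejects anything that is not in our format.
    if not blob.startswith(PREFIX):
        raise ValueError("not serialized data: %r" % blob)
    return json.loads(blob[len(PREFIX):])


def unserialize_buggy(blob):
    # Digits-only check: "-1" fails it because '-' is not a digit,
    # so the value falls through to the deserializer and errors out.
    if re.fullmatch(r"[0-9]+", blob):
        return int(blob)
    return _decode_serialized(blob)


def unserialize_fixed(blob):
    # Accept an optional leading minus sign so negative integers round-trip.
    if re.fullmatch(r"-?[0-9]+", blob):
        return int(blob)
    return _decode_serialized(blob)
```

With these definitions, unserialize_fixed(serialize(-1)) returns the integer -1, while unserialize_buggy(serialize(-1)) raises ValueError, which mirrors the unserialize() error described above.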
Brad did all the hard work, diagnosing this and writing up a fix on Phabricator. I turned that into a patch and deployed it about an hour ago. There haven't been any more errors since then.
On Wed, May 11, 2016 at 1:59 PM, Chad Horohoe <chorohoe@wikimedia.org> wrote:
Hi,
When we deployed the first 1.28 release to the cluster yesterday, we got a new error[0] relating to unserialization of redis data. It's pretty spammy already, so I'm paranoid about deploying wider until we figure out why. Deploying some debugging work soon so we can figure out what's going on.
If you've got any information you think would help, please chime in on the bug.
-Chad
[0] https://phabricator.wikimedia.org/T134923
Engineering mailing list Engineering@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/engineering