[Labs-l] wiki replication stopped since September

Dan Andreescu dandreescu at wikimedia.org
Wed Nov 26 21:14:54 UTC 2014


Oops, I may have caused a bigger reaction than I intended.  Sean and team
know about this, and are trying to fix it.  But there are other nuances.
We've been working with Sean on designing a new schema to replicate
data into.  The purpose is to make analysis work easier, and to isolate the
specific optimization and capacity problems and needs that analytics
products have.  I can only be vague as we're in the very early stages [1],
but I just wanted to point out that Sean is by no means ignoring this
problem.  Not only is he working on the production cluster (!), but he's
also addressing the data integrity problems I mentioned, and he's
*also* working with us to fix the root causes and higher-level
architecture.

Things like this take a long time, and two months of bad data does not imply
any kind of catastrophe for us at this point.  I'll also point out that
these problems are mostly due to bizarre bugs and database problems that
even the db engine authors don't seem to understand yet.  The products that
are impacted by these problems are products we care about, but not at the
same level as our tier 1 or tier 2 stuff.  Vital Signs is the most impacted
and it's a relatively new project that still has not replaced the
reportcard.  As we promote these projects to "ready", we're planning better
around them, as mentioned above.

[1] https://gerrit.wikimedia.org/r/#/c/167839/