Some of the graphs [1] on the report card are not rendering due to what seems like some sort of EventLogging outage.
When I query the database I get errors such as "MySQL said: Table 'MobileWebEditing_7675117' is marked as crashed and should be repaired".
Let's keep an eye on this... [1] http://mobile-reportcard.wmflabs.org/#edits_daily-graphs-tab
See https://bugzilla.wikimedia.org/show_bug.cgi?id=64445
On Fri, Apr 25, 2014 at 1:06 PM, Jon Robson jrobson@wikimedia.org wrote:
Some of the graphs [1] on the report card are not rendering due to what seems like some sort of EventLogging outage.
When I query the database I get errors such as "MySQL said: Table 'MobileWebEditing_7675117' is marked as crashed and should be repaired".
Let's keep an eye on this... [1] http://mobile-reportcard.wmflabs.org/#edits_daily-graphs-tab
Sean sent this update earlier (before Jon noticed the issue) and agreed to let me forward it here.
db1047 has been upgraded to MariaDB 10, however it is not yet replicating from m2 because I ran into a replication bug on db1046, the m2 slave I had intended to make db1047's master for the log db. Until I get that glitch sorted I have db1047 federating the log tables instead using the new CONNECT engine, which seems fine so far.
The data on db1048 looks healthy (and is continuing to accumulate incoming events), which supports Dan's hunch that the issue lies with replication. So there is still nothing that suggests data loss. But once the migration is over, we should cross-reference the database with the log files and confirm this definitively.
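The cross-check suggested above could be sketched roughly as follows. This is only an illustration, not the actual EventLogging tooling: it assumes each log line is a JSON object carrying a `uuid` field and that the same identifier is stored with each database row. The function names and sample values are hypothetical.

```python
import json


def uuids_from_log(lines):
    """Collect event identifiers from JSON log lines (assumed format)."""
    found = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            # Skip lines that are not valid JSON instead of aborting the check.
            continue
        if isinstance(event, dict) and "uuid" in event:
            found.add(event["uuid"])
    return found


def missing_from_db(log_lines, db_uuids):
    """Return identifiers present in the log but absent from the database."""
    return sorted(uuids_from_log(log_lines) - set(db_uuids))


# Illustrative data only; a real check would read the log files and
# fetch the identifiers from the table with a SELECT.
sample_log = [
    '{"uuid": "a1", "schema": "MobileWebEditing"}',
    '{"uuid": "b2", "schema": "MobileWebEditing"}',
    "not json",
]
print(missing_from_db(sample_log, ["a1"]))
```

An empty result would support the view that no events were lost; any identifiers returned would be candidates for backfilling from the logs.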
On Fri, Apr 25, 2014 at 1:08 PM, Jon Robson jrobson@wikimedia.org wrote:
See https://bugzilla.wikimedia.org/show_bug.cgi?id=64445
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Are there any updates on this from Sean? It's not super critical at the moment; I just want to know what's going on and when we might expect the graphs and data to return to normal :)
On Fri, Apr 25, 2014 at 3:57 PM, Ori Livneh ori@wikimedia.org wrote:
Sean sent this update earlier (before Jon noticed the issue) and agreed to let me forward it here.
db1047 has been upgraded to MariaDB 10, however it is not yet replicating from m2 because I ran into a replication bug on db1046, the m2 slave I had intended to make db1047's master for the log db. Until I get that glitch sorted I have db1047 federating the log tables instead using the new CONNECT engine, which seems fine so far.
The data on db1048 looks healthy (and is continuing to accumulate incoming events), which supports Dan's hunch that the issue lies with replication. So there is still nothing that suggests data loss. But once the migration is over, we should cross-reference the database with the log files and confirm this definitively.
Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l
Are there any updates on this from Sean? It's not super critical at the moment; I just want to know what's going on and when we might expect the graphs and data to return to normal :)
The latest I know is that the mobile-reportcard dashboards are updating again. That means db1047/log is OK to query. From what I can see there are data gaps, but maybe Ori can shed more light on that. I'll try pinging him.
Is there a bug/ticket tracking this other than the one filed under MobileFrontend? If not, perhaps we should reassign the MobileFrontend bug to a more appropriate place.
On Wed, Apr 30, 2014 at 2:19 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Are there any updates on this from Sean? It's not super critical at the moment; I just want to know what's going on and when we might expect the graphs and data to return to normal :)
The latest I know is that the mobile-reportcard dashboards are updating again. That means db1047/log is ok to query. From what I can see there are data gaps but maybe Ori can shed more light on that. I'll try pinging him
Is there a bug/ticket tracking this other than the one filed under MobileFrontend? If not, perhaps we should reassign the MobileFrontend bug to a more appropriate place.
We should report a bug for every separate issue, and I think this one is "Limn Dashboards not working since EventLogging migration".
There also do not seem to be any new entries being created in MobileWebUploads_8209043 which is slightly concerning...
On Wed, Apr 30, 2014 at 2:28 PM, Dan Andreescu dandreescu@wikimedia.org wrote:
Is there a bug/ticket tracking this other than the one filed under MobileFrontend? If not, perhaps we should reassign the MobileFrontend bug to a more appropriate place.
We should report a bug for every separate issue, and I think this one is "Limn Dashboards not working since EventLogging migration".
On Wed, Apr 30, 2014 at 3:05 PM, Jon Robson jdlrobson@gmail.com wrote:
There also do not seem to be any new entries being created in MobileWebUploads_8209043 which is slightly concerning...
Jon, if you're looking at the usual log database, this is likely just the ~30 hours of lag. EventLogging was recently switched over to write to db1048 instead, and then federate to db1047 for analysis purposes (if I'm describing that correctly).
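A rough way to put a number on that lag is to compare the newest event timestamp visible on each host. The sketch below assumes timestamps are stored as `YYYYMMDDHHMMSS` strings; the function name and the sample values are purely illustrative and are not taken from the actual tables.

```python
from datetime import datetime

# Assumed timestamp layout (YYYYMMDDHHMMSS); adjust if the column differs.
TS_FORMAT = "%Y%m%d%H%M%S"


def replication_lag_hours(newest_on_master, newest_on_replica):
    """Hours between the newest event on the master and on the replica."""
    master = datetime.strptime(newest_on_master, TS_FORMAT)
    replica = datetime.strptime(newest_on_replica, TS_FORMAT)
    return (master - replica).total_seconds() / 3600.0


# Illustrative values only: in practice each would come from a
# MAX(timestamp) query against the same table on each host.
print(replication_lag_hours("20140430220000", "20140429160000"))
```

If the figure keeps shrinking over successive checks, the replica is catching up; if it keeps growing, something other than ordinary lag is probably going on.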
I've not seen any events logged on this table since 26th April. I know for sure there have been some since then, as I triggered some on the 27th (I saw them in the network tab). I'm using stat1003; I just want to rule out schema validation errors.
Let's give the database the time it needs to replicate and perform needed validation before we start troubleshooting other issues. I'm concerned that too many things are going on here.
Thanks to everyone who is working on this right now.
-Toby
On Wed, Apr 30, 2014 at 3:17 PM, Jon Robson jdlrobson@gmail.com wrote:
I've not seen any events logged on this table since 26th April. I know for sure there have been some since then as I triggered some on the 27th (I saw them in the network tab)... I'm using stat1003 I just want to rule out schema validation errors.
On Wed, Apr 30, 2014 at 3:05 PM, Jon Robson jdlrobson@gmail.com wrote:
There also do not seem to be any new entries being created in MobileWebUploads_8209043 which is slightly concerning...
I see entries in db1048. So, again: this is temporary lag in replication due to the setup being new.
Ori, thanks for checking and for the reassurance, and thanks for sorting this out, guys! Keep us updated!
On Wed, Apr 30, 2014 at 3:27 PM, Ori Livneh ori@wikimedia.org wrote:
I see entries in db1048. So, again: this is temporary lag in replication due to the setup being new.