Speaking of the exception log, I personally use https://gerrit.wikimedia.org/r/38252 to monitor it. Unfortunately, it still hasn't been reviewed, so it's not available for everyone to use yet :P

On Thu, Oct 31, 2013 at 9:08 PM, Matthew Walker <mwalker@wikimedia.org> wrote:
With all my prep work completed ahead of time, I can get a CentralNotice LD out to both production branches in about 15 minutes (waiting on the Jenkins merge is the longest bit of that). I watch both the fatal and exception logs whilst doing it and then quickly run through the patches to make sure it's all working.

I've felt pressured in the LD to get stuff out and myself out of the way when there have been more than two people in it -- which does correlate with my 15 minute estimate for the fastest I feel I can safely deploy.

~Matt Walker
Wikimedia Foundation
Fundraising Technology Team

On Thu, Oct 31, 2013 at 7:53 AM, Adam Baso <abaso@wikimedia.org> wrote:
Everyone, I apologize for the bug.

I'll look for ways to guard better against this risk in the future, which will be important as we look to expand coverage of Wikipedia Zero to sister projects and the desktop form factor.

Thanks to everyone for resolving the issue so quickly. You guys rule.

And Roan, thanks for not flipping over my desk, despite the bug making RL go haywire on Wikidata AND holding up your lightning deployment. It's true - you are a gentleman and a scholar.


On Wed, Oct 30, 2013 at 5:57 PM, Yuri Astrakhan <yastrakhan@wikimedia.org> wrote:
== Background ==
ZeroRatedMobileAccess has always depended on MobileFrontend and used it
liberally, including calls to its classes. However, those calls were made
in hooks called by MF, so Zero simply stayed inactive in the absence of MF.
This changed in [1], where Zero started using a ResourceLoader module
from MF.

== What happened ==
At 23:02 UTC, after the Zero extension updates were deployed, the fatal
monitor was flooded with:

 -- Fatal error: Class 'MFResourceLoaderModule' not found in /usr/local/
line 408

The issue was tracked down to Wikidata having MobileFrontend disabled
while ZeroRatedMobileAccess was enabled. It didn't impact page views
directly. However, all load.php calls that requested the startup module
caused fatals, because the startup module attempted to instantiate the
MFResourceLoaderModule class and couldn't find it. As a consequence,
people might have seen pages without styles or scripts.

A number of people (MaxSem, Reedy, Roan, Greg, and possibly others)
gave great assistance in tracking down the issue and rapidly disabling the
ZeroRatedMobileAccess extension on Wikidata. Furthermore, a mobile
configuration change [2] will add a guard against loading
ZeroRatedMobileAccess.php unless it's explicitly within the context of MF.

Thank you to everyone!!!

== Timeline ==
All times in UTC

* 22:48 Zero 1.22wmf22 deployed, no errors
* 23:02 Zero 1.23wmf1 deployed, first errors appear - initially unnoticed
* 23:08 A small MobileFrontend change deployed
* 23:09 Errors noticed, initially linked with MobileFrontend push
* 23:17 Max reverts his MobileFrontend changes, errors don't go away
* 23:22 Problem narrowed down
* 23:27 Fix deployed

== Recommendations ==
* Allow a bit more time between deployments and observe fatalmonitor before
and after
* Ensure the Zero extension checks whether MobileFrontend is loaded before
enabling itself, if it relies on MFResourceLoaderModule.
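
The dependency check in the second recommendation can be modelled in a language-neutral way. The sketch below is Python, not the extension's PHP, and every name in it is hypothetical (`register_zero_module`, `build_startup`, the `zero.example` and `core.example` module names, the stand-in classes). It only illustrates the idea: a startup step that instantiates every registered module fails as a whole when one class is missing, whereas a guard at registration time keeps the dependent module switched off.

```python
# Illustrative model only; no name here comes from MediaWiki or the Zero extension.

class CoreModule:
    """Stand-in for a module class that is always available."""


class MFResourceLoaderModule:
    """Stand-in for the class that only exists where MobileFrontend is enabled."""


def register_zero_module(registry, loaded_classes, guarded=True):
    """Register the (hypothetical) Zero module that needs the MF class.

    With guarded=True the module is skipped when the class is not loaded.
    Returns True if the module was registered.
    """
    if guarded and "MFResourceLoaderModule" not in loaded_classes:
        return False
    registry["zero.example"] = "MFResourceLoaderModule"
    return True


def build_startup(registry, loaded_classes):
    """Instantiate every registered module, the way a startup step would."""
    instances = {}
    for module_name, class_name in registry.items():
        if class_name not in loaded_classes:
            # Models the fatal: one missing class breaks the whole startup step.
            raise LookupError(f"Class '{class_name}' not found")
        instances[module_name] = loaded_classes[class_name]()
    return instances


# A wiki where MobileFrontend is disabled: only the core class is loaded.
without_mf = {"CoreModule": CoreModule}

# Without the guard, the startup step fails for every request.
unguarded = {"core.example": "CoreModule"}
register_zero_module(unguarded, without_mf, guarded=False)
try:
    build_startup(unguarded, without_mf)
    unguarded_failed = False
except LookupError:
    unguarded_failed = True

# With the guard, Zero stays disabled and the startup step still works.
guarded = {"core.example": "CoreModule"}
zero_registered = register_zero_module(guarded, without_mf, guarded=True)
startup = build_startup(guarded, without_mf)
```

In the real extension the equivalent check would live in the PHP setup code or in site configuration, as [2] does; this model says nothing about how that change is actually written.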

[1] https://gerrit.wikimedia.org/r/#/c/83133
[2] https://gerrit.wikimedia.org/r/#/c/92811