On Thu, May 29, 2014 at 1:34 PM, ENWP Pine <deyntestiss(a)hotmail.com> wrote:
> Hi, I'm getting some 404 errors consistently when trying to load some
> English Wikipedia articles. Other pages load ok. Did something break?

TL;DR: A package update went badly.
Nitty-gritty postmortem:
At 20:25 (all times UTC), change Ie5a860eb9[0] ("Remove
wikimedia-task-appserver from app servers") was merged. There were two
things wrong with it:
1) The appserver package was configured to delete the mwdeploy and apache
users upon removal. The apache user was not deleted because it was logged
in, but the mwdeploy user was. The mwdeploy account was declared in Puppet,
but there was a gap between the removal of the package and the next Puppet
run during which the account would not be present.
2) The package included the symlinks /etc/apache2/wmf and
/usr/local/apache/common, which were not Puppetized. These symlinks were
unlinked when the package was removed.
Apache was configured to load configuration files from /etc/apache2/wmf,
and these include the files that declare the DocumentRoot and Directory
directives for our sites. With the symlinks gone, users were served 404s. At
20:40 Faidon Liambotis re-installed wikimedia-task-appserver on all
Apaches. Since 404s are cached in Varnish, it took another five minutes for
the rate of 4xx responses to return to normal (20:45).[1]
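
For illustration only, here is a minimal Python sketch of a check for the two
conditions described above: an account that has gone missing and a path that
is absent or dangling. The user names and paths come from this postmortem;
the function names, and the idea of running such a check after a package
removal, are my own assumptions and not something we currently have in place.

```python
import os
import pwd

# Names taken from the postmortem. Treating them as "must exist after the
# package is removed" is an assumption made for this sketch.
REQUIRED_USERS = ["mwdeploy", "apache"]
REQUIRED_PATHS = ["/etc/apache2/wmf", "/usr/local/apache/common"]


def missing_users(names, lookup=pwd.getpwnam):
    """Return the account names that the lookup function cannot find."""
    missing = []
    for name in names:
        try:
            lookup(name)  # pwd.getpwnam raises KeyError for unknown users
        except KeyError:
            missing.append(name)
    return missing


def missing_paths(paths):
    """Return paths that are absent or are symlinks whose target is gone."""
    # os.path.exists follows symlinks, so a dangling link counts as missing.
    return [p for p in paths if not os.path.exists(p)]


def check():
    """Collect every missing account and path into one list (hypothetical helper)."""
    return missing_users(REQUIRED_USERS) + missing_paths(REQUIRED_PATHS)
```

Calling check() on an app server would return an empty list when both
accounts and both paths are present, and the names of whatever is missing
otherwise.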
[0]: https://gerrit.wikimedia.org/r/#/c/136151/
[1]: https://graphite.wikimedia.org/render/?title=HTTP%204xx%20responses%2C%2020…