Hi,
here's a more detailed report:
As part of an ongoing effort to simplify and clean up the Apache cluster config and reduce its line count, we were planning to unify a lot of document roots for the "www" portals into a single docroot, like here:
https://gerrit.wikimedia.org/r/#/c/90669/
Looking at that, we saw that the www.wikimedia.org portal, unlike the other portals, wasn't handled by this, so to unify things further we merged:
https://gerrit.wikimedia.org/r/#/c/91195/
which "just" moved the existing config for www.wikimedia.org to the wwwportals.conf file.
But this config contained an old "ServerAlias *.wikimedia.org", and the move changed its position in the order in which Apache goes through the config. As a result, *.wikimedia.org URLs were redirected to wikimediafoundation.org:
..
ServerAlias *.wikimedia.org
..
RewriteRule ^/wiki/(.*)$ http://wikimediafoundation.org/wiki/$1 [R=301,L]
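To illustrate why the ordering matters, here is a rough sketch of first-match virtual host selection. It is a simplified model, not Apache's actual implementation: the function name, the dictionary layout, and the "commons" vhost are hypothetical and only stand in for "some other, more specific *.wikimedia.org vhost"; only the www.wikimedia.org name and the wildcard alias come from the config above.

```python
from fnmatch import fnmatchcase

# Hypothetical, simplified vhost descriptions (not the real cluster config).
PORTAL = {
    "label": "wwwportal",
    "server_name": "www.wikimedia.org",
    "aliases": ["*.wikimedia.org"],  # the old wildcard alias
}
COMMONS = {
    "label": "commons",  # assumed stand-in for a more specific vhost
    "server_name": "commons.wikimedia.org",
    "aliases": [],
}


def select_vhost(vhosts, host):
    """Return the label of the first vhost whose name or alias matches host.

    Simplified model: vhosts are checked in config order and the first
    match wins; if nothing matches, the first vhost acts as the default.
    """
    host = host.lower()
    for vhost in vhosts:
        patterns = [vhost["server_name"], *vhost["aliases"]]
        if any(fnmatchcase(host, p.lower()) for p in patterns):
            return vhost["label"]
    return vhosts[0]["label"]


# Wildcard vhost late in the order: the specific vhost is reached first.
before = select_vhost([COMMONS, PORTAL], "commons.wikimedia.org")

# Wildcard vhost moved earlier: it now swallows the specific hostname,
# so that vhost's RewriteRule applies to the request.
after = select_vhost([PORTAL, COMMONS], "commons.wikimedia.org")
```

In this model `before` is "commons" and `after` is "wwwportal", which matches the behaviour we saw: once the wildcard alias is read earlier, requests for other *.wikimedia.org hosts land in the portal vhost and hit its redirect.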
We reverted both of the recent changes after just a few minutes, synced the Apache config and restarted the Apaches.
Users still reported problems, though, due to caching. So we started some Squid purging, and Mark banned the php content-type in Varnish.
Additionally, it turned out that several Apaches didn't get restarted properly by apache-graceful-all. Using apache-fast-test with the pybal option, we found some more that needed manual restarts a little while later.
For further fixes, Brandon banned by object size:
< bblack> !log varnish: banned 'obj.http.content-length == 33518' on text varnishes everywhere (extract2.php leakage)
Reedy has already prepared new patches to fix the root cause, and apache-fast-test should now check eqiad instead of Tampa: https://gerrit.wikimedia.org/r/#/c/91270/
Sorry for the breakage! Thank you for the help!