Trying again from a different address
On 26 January 2016 at 01:04, Alex Monk amonk@wikimedia.org wrote:
Forwarding to wikitech-l since this is not really specific to staff, but all shell users.
On 25 January 2016 at 20:39, Ori Livneh ori@wikimedia.org wrote:
The X-Wikimedia-Debug header, for those of you who don't know, is an HTTP request header that you can set on your requests (either manually, or by using the Chrome[1] or Firefox[2] extensions). Requests bearing this header are always treated as cache misses by Varnish, and they are always routed to the same backend, mw1017.
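For anyone setting the header manually rather than through an extension, here is a minimal Python sketch. The header name comes from the description above. The value "1", the example URL, and the helper name build_debug_request are assumptions for illustration, and the request is only built, never sent.

```python
import urllib.request


def build_debug_request(url, value="1"):
    """Build (but do not send) a request carrying X-Wikimedia-Debug.

    The value "1" is an assumption; the description above only says the
    header has to be set on the request.
    """
    return urllib.request.Request(url, headers={"X-Wikimedia-Debug": value})


req = build_debug_request("https://en.wikipedia.org/wiki/Main_Page")

# urllib normalizes header capitalization, so compare case-insensitively.
headers = {name.lower(): val for name, val in req.header_items()}
print(headers["x-wikimedia-debug"])
```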
In addition to handling X-Wikimedia-Debug requests, mw1017 is also configured as the sole application server backend for all requests to test.wikimedia.org. This was set up before X-Wikimedia-Debug existed, and as a debugging tool it is (IMO) inferior to it, because X-Wikimedia-Debug allows you to test code changes against any production wiki.
What I've seen happen before is developers (like me -- I've done this before) live-hack code on mw1017 to debug some issue that is only showing up in production. This can cause testwiki to break, which annoys developers and editors who use testwiki for testing things like Lua modules or editing functionality on mobile apps.
To reduce contention for mw1017, I propose that we do the following:
- Keep testwiki, but don't special-case it in Varnish (in other words, have testwiki requests go to the standard app server pool).
- Reserve mw1017 exclusively for X-Wikimedia-Debug requests.
- Add a service alias (appservers-debug.svc.eqiad.wmnet) for mw1017 and update the Varnish backend config to use that, rather than hard-code mw1017 in VCL.
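The routing that would result from the three points above can be sketched in Python (the real logic would live in VCL, which is not shown here). The debug alias is the one proposed above; STANDARD_POOL and the function name route are hypothetical placeholders.

```python
DEBUG_BACKEND = "appservers-debug.svc.eqiad.wmnet"  # alias proposed above
STANDARD_POOL = "appservers.svc.eqiad.wmnet"  # hypothetical name for the regular pool


def route(host, headers):
    """Return (backend, bypass_cache) for a request.

    The host is deliberately ignored: under the proposal, test.wikimedia.org
    is no longer special-cased and only the debug header changes routing.
    """
    names = {name.lower() for name in headers}
    if "x-wikimedia-debug" in names:
        return DEBUG_BACKEND, True
    return STANDARD_POOL, False


print(route("test.wikimedia.org", {}))
print(route("en.wikipedia.org", {"X-Wikimedia-Debug": "1"}))
```

Under this sketch a live hack on the debug host can only affect requests that explicitly opt in via the header.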
Thoughts?
Engineering mailing list Engineering@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/engineering
-- Alex Monk VisualEditor/Editing team https://wikimediafoundation.org/wiki/User:Krenair_(WMF)
This is very timely and relevant to a proposal I am currently working on.
Everything you have proposed seems sensible and beneficial to me. Since you've broached the subject, I'm going to attempt to describe what I have been wishing for. I was literally in the middle of writing it up when I saw this email thread, so here it is, slightly premature:
I'd like to extend the functionality of the X-Wikimedia-Debug header, or perhaps augment it with a similar but slightly different capability. I think it would be very useful to have a way to, in addition to cache-busting, also force the request to be served from the pre-production branch rather than the current production branch. This way changes on the prod+1 branch can be conveniently tested on any wiki (not just testwiki) while disregarding the version specified in wikiversions.
The X-Wikimedia-Debug header seems like the ideal way to implement this. If the header accepted multiple values, then the new behavior could easily be implemented in the multiversion branch selection code, without affecting the previous behavior.
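As a rough Python sketch of that idea (the actual multiversion code is PHP and is not reproduced here): the comma-separated value format, the "next" flag, the function names, and the version strings below are all hypothetical, chosen only to show how an extra header value could override wikiversions without changing what a plain header does.

```python
def parse_debug_header(value):
    """Split a comma-separated header value into a set of lowercase flags."""
    return {token.strip().lower() for token in value.split(",") if token.strip()}


def select_version(wiki, wikiversions, flags, next_version):
    """Pick the code branch for a wiki.

    Without the (hypothetical) "next" flag, the wikiversions mapping is used
    unchanged, so the existing behavior of the header is preserved.
    """
    if "next" in flags:
        return next_version
    return wikiversions[wiki]


# Illustrative data only; not the real contents of wikiversions.
wikiversions = {"enwiki": "php-1.27.0-wmf.10", "testwiki": "php-1.27.0-wmf.11"}
next_version = "php-1.27.0-wmf.11"

print(select_version("enwiki", wikiversions, parse_debug_header("1"), next_version))
print(select_version("enwiki", wikiversions, parse_debug_header("1, next"), next_version))
```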
I'd like to hear feedback on this further proposal. I intended to propose this along with several related changes to multiversion and other relevant technical debt repayment / code cleanup; however, this one probably stands on its own, especially in the context of what Ori has proposed.
On Tue, Jan 26, 2016 at 11:58 PM, Mukunda Modell mmodell@wikimedia.org wrote:
I think it would be very useful to have a way to, in addition to cache-busting, also force the request to be served from the pre-production branch rather than the current production branch. This way changes on the prod+1 branch can be conveniently tested on any wiki (not just testwiki) while disregarding the version specified in wikiversions.
Well, there is a way: you can edit /srv/mediawiki/wikiversions.php on mw1017 to change the mapping of wikis to branches, and set the X-Wikimedia-Debug header to ensure your request gets handled by mw1017. Making this more convenient would be very risky, because it would mean that two different versions of the code are transacting with data on shared storage backends, each with the presumption of being the only game in town. And this state could be triggered by anyone, with no !logging or coordination.
On Wed, Jan 27, 2016 at 2:17 AM, Ori Livneh ori@wikimedia.org wrote:
Well, there is a way: you can edit /srv/mediawiki/wikiversions.php on mw1017 to change the mapping of wikis to branches, and set the X-Wikimedia-Debug header to ensure your request gets handled by mw1017. Making this more convenient would be very risky, because it would mean that two different versions of the code are transacting with data on shared storage backends, each with the presumption of being the only game in town. And this state could be triggered by anyone, with no !logging or coordination.
We really shouldn't be making the assumption that only a single version is running at any given time. We are moving towards gradual / canary deployments, with a portion of traffic hitting a new branch while the remainder hits the old branch. It's currently not safe to assume that any version is the only game in town, and it's only going to get worse.
Anyone who is currently operating on that kind of assumption is being reckless. Any time new code is changing assumptions about the structure of shared storage, it needs to be rigorously reviewed and carefully protected from premature release. Deployments aren't at all atomic now; as far as I know, they never have been.
As a release engineer I see it as a failure on the part of my team if we allow code to go out without proper gating of potentially destructive or disruptive storage format changes.
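To make the gradual / canary point concrete, here is a small Python sketch of one common way to split traffic between two branches. It is not a description of any existing deployment tooling; the hashing scheme, the function name pick_branch, and the branch labels are assumptions for illustration.

```python
import hashlib


def pick_branch(request_key, canary_percent, old_branch, new_branch):
    """Deterministically send roughly canary_percent% of keys to new_branch."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return new_branch if bucket < canary_percent else old_branch


# With any percentage strictly between 0 and 100, both branches serve
# traffic at once, so neither can assume it is the only version running.
keys = ["req-%d" % i for i in range(1000)]
served = {pick_branch(key, 10, "old", "new") for key in keys}
print(sorted(served))
```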
wikitech-l@lists.wikimedia.org