Hello all,
Let's move our MediaWiki deploy cycle to weekly instead of 2-week. This will also reduce the number of standing deployment windows throughout the week by having those projects/teams simply "ride the MediaWiki train."
== Current situation ==
Right now, a new version of MediaWiki is rolled out to the WMF cluster over a two-week period. You can see the general flow of how it works on this page describing the deploy schedule for the 1.22 release: https://www.mediawiki.org/wiki/MediaWiki_1.22/Roadmap#Schedule_for_the_deplo...
== What are the drawbacks of two-weeks? ==
These are mostly known by everyone so I'll just simply state the most obvious one :-)
It takes up to 2 weeks for new features/bug fixes to be rolled out to the various Wikimedia wikis.
== What would a one-week cycle look like? ==
This has been talked about a bit, including during the last In Town Week for WMF Engineering in late-February. I've coalesced on one proposal at: https://wikitech.wikimedia.org/wiki/Deployments/One_week
This seems like a reasonable approach to me. Please respond here or on the talk page with comments/suggestions/etc.
== Major benefit/goal with one-week cycle ==
Our current list of deployment windows during the week is pretty large, and it is not uncommon for a week to practically fill up with bug fix windows. If we moved to a weekly cycle then more of those bug-fixes could just roll out with the normal MediaWiki deploy.
== Goal Timeline? ==
I would love to get us switched over to a one-week cycle by mid-June.
Greg
Extrapolating from our experience from two-week deploy cycles and what bugs we find at each stage, would moving to a one-week deploy cycle substantially increase the number of users exposed to really bad bugs?
For instance, if we're currently finding and fixing about 2 deployment blocker bugs per cycle after the mediawiki.org deploy but before the non-Wikipedias deploy, that data might imply that we should structure a 1-week cycle differently -- perhaps with three stages rather than two.
On Mon, May 6, 2013 at 12:20 PM, Sumana Harihareswara sumanah@wikimedia.org wrote:
Extrapolating from our experience from two-week deploy cycles and what bugs we find at each stage, would moving to a one-week deploy cycle substantially increase the number of users exposed to really bad bugs?
For instance, if we're currently finding and fixing about 2 deployment blocker bugs per cycle after the mediawiki.org deploy but before the non-Wikipedias deploy, that data might imply that we should structure a 1-week cycle differently -- perhaps with three stages rather than two.
I don't know what the stats are, but my intuition is that it's not generally that high. However, the ones we find are kinda big, so it still makes me nervous to go that route.
I've edited the page, called Greg's proposal "Alternative A", and added "Alternative B" which provides a little bit of time for catching issues on mw.org:
https://wikitech.wikimedia.org/wiki/Deployments/One_week
Rob
<quote name="Rob Lanphier" date="2013-05-06" time="12:50:18 -0700">
I've edited the page, called Greg's proposal "Alternative A", and added "Alternative B" which provides a little bit of time for catching issues on mw.org:
Take a look at the page now, a few things you'll notice:
0) (very very) basic stats on our current cycle.
1) Each proposal now has a deployments calendar (the wikitable) showing which day of the week there are deploys, and what happens during those deploys.
2) There is now also a lifespan picture (sorry for the glare!) show, well, the lifespan of each wmfXX branch on production servers along with "development time" (ie: how many days between each wmfXX branch creation (ie: how many days of development per branch)).
Take aways Option A * Devel time: 7 days * Lifespan: 10 days Option B * Devel time: 7 days * Lifespan: 14 days
3) There are now Pros/Cons tables for each option that outline what you think they outline.
4) Also, lastly, James_F kindly was the one to create the Talk: page with a clarification question on automated tests.
That's the highlight of changes for that page so you can go look at it again and see the interesting bits (if any) for you.
I'll reply separately with my recommendation.
Greg
On Mon, 06 May 2013 21:04:39 +0200, Greg Grossmeier greg@wikimedia.org wrote:
It takes up to 2 weeks for new features/bug fixes to be rolled out to the various Wikimedia wikis.
Four weeks, actually, if your change gets merged just after a branching point. (2 weeks until next branching + 2 weeks until it's deployed everywhere.) Needless to say I like this idea.
On Mon, May 6, 2013 at 3:04 PM, Greg Grossmeier greg@wikimedia.org wrote:
Let's move our MediaWiki deploy cycle to weekly instead of 2-week. This will also reduce the number of standing deployment windows throughout the week by having those projects/teams simply "ride the MediaWiki train."
\o/
It takes up to 2 weeks for new features/bug fixes to be rolled out to the various Wikimedia wikis.
3.5 weeks, actually. Consider if something was merged in the afternoon on April 29. It just missed getting into 1.22wmf3, so absent backporting it would have to wait for 1.22wmf4, which finishes being deployed everywhere on May 22.
This has been talked about a bit, including during the last In Town Week for WMF Engineering in late-February. I've coalesced on one proposal at: https://wikitech.wikimedia.org/wiki/Deployments/One_week
This seems like a reasonable approach to me. Please respond here or on the talk page with comments/suggestions/etc.
One other plan that was discussed in February, if I recall, was something like this:
week0, Thursday: Deploy wmf1 to test and mw.org week1, Monday: Deploy wmf1 to first-round wikis week1, Thursday: Deploy wmf1 to remaining wikis and wmf2 to test and mw.org. week2, Monday: Deploy wmf2 to first-round wikis week2, Thursday: Deploy wmf2 to remaining wikis and wmf3 to test and mw.org. etc.
That has the advantage of preserving the separation of test+mw.org from the more user-focused wikis, as Sumana mentioned.
-- Brad Jorsch Software Engineer Wikimedia Foundation
<quote name="Brad Jorsch" date="2013-05-06" time="15:32:19 -0400">
One other plan that was discussed in February, if I recall, was something like this:
week0, Thursday: Deploy wmf1 to test and mw.org week1, Monday: Deploy wmf1 to first-round wikis week1, Thursday: Deploy wmf1 to remaining wikis and wmf2 to test and mw.org. week2, Monday: Deploy wmf2 to first-round wikis week2, Thursday: Deploy wmf2 to remaining wikis and wmf3 to test and mw.org. etc.
That has the advantage of preserving the separation of test+mw.org from the more user-focused wikis, as Sumana mentioned.
Yep, Robla's creating that sample calendar now.
My reasoning for removing that initial separation that may be faulty:
We're running automated and non-automated tests against betalabs, and that has started producing results (bugs reported and fixed before it was ever included in a wmfXX branch). Also, the majority of bugs that are in the Highest/Immediate priority level (from my gut assessment, I don't have the data here) are found after a deploy to non-WP projects.
Thus, in a way, the non-WP projects really are our betatesters in all practicalities. That's my opinion at least. I'll work with Andre to get some numbers on this, if we can (date blocker bugs reported against our historic deploy calendar).
I'm pretty sure this email will have a mid-air collision with Robla and his writeup of the proposal you outlined, Brad.
Greg
On Mon, May 6, 2013 at 12:52 PM, Greg Grossmeier greg@wikimedia.org wrote:
Yep, Robla's creating that sample calendar now.
My reasoning for removing that initial separation that may be faulty:
We're running automated and non-automated tests against betalabs, and that has started producing results (bugs reported and fixed before it was ever included in a wmfXX branch). Also, the majority of bugs that are in the Highest/Immediate priority level (from my gut assessment, I don't have the data here) are found after a deploy to non-WP projects.
Thus, in a way, the non-WP projects really are our betatesters in all practicalities. That's my opinion at least. I'll work with Andre to get some numbers on this, if we can (date blocker bugs reported against our historic deploy calendar).
I'm pretty sure this email will have a mid-air collision with Robla and his writeup of the proposal you outlined, Brad.
I'm all for a move to a one-week cycle. I think it's a good sign that even discussing moving to a one-week cycle brings up a review of how and where we're catching critical bugs, ala the above comments from Greg, Sumana, etc. Weekly deployments pushing us all to be more diligent about this stuff sounds like a good thing, even if it may feel intense pre/post the switch in the short term.
On Mon, May 6, 2013 at 3:52 PM, Greg Grossmeier greg@wikimedia.org wrote:
I'm pretty sure this email will have a mid-air collision with Robla and his writeup of the proposal you outlined, Brad.
I think Robla was the one who suggested it in February, actually. I was just repeating it here.
<quote name="Brad Jorsch" date="2013-05-06" time="16:15:33 -0400">
On Mon, May 6, 2013 at 3:52 PM, Greg Grossmeier greg@wikimedia.org wrote:
I'm pretty sure this email will have a mid-air collision with Robla and his writeup of the proposal you outlined, Brad.
I think Robla was the one who suggested it in February, actually. I was just repeating it here.
Right, didn't mean to suggest otherwise.
Greg
On Mon, 2013-05-06 at 12:52 -0700, Greg Grossmeier wrote:
Also, the majority of bugs that are in the Highest/Immediate priority level (from my gut assessment, I don't have the data here) are found after a deploy to non-WP projects.
I agree with that impression: We don't get many (manually found) highest/immediate prio bug reports after the first deployment phase, most of them after phase 2, and a few after phase 3 (e.g. when we failed to understand the explosive force of an issue).
Backing that impression up with Bugzilla data: <tl;dr>: That's hard.
Long version: I tried a Bugzilla query for tickets created in the last four months, that at some point in their lifetime had Priority = {Highest | Immediate}, restricted it to the products {MediaWiki, MediaWiki extensions, Wikimedia}, made buglist.cgi display the "Opened" column (via "Change columns" at the bottom); dropped the last 9 characters of the "Opened" column (to get rid of the time and only have the date, though that's UTC so does not perfectly fit our deployment *time*), imported the resulting CSV into OOCalc, cumulated a bit, and summed up all those tickets that got filed in a certain deployment phase && at *some* point became highest/immediate. See attachment.
The results don't back up my impression. One potential reason: Development teams file tickets *at some point* and don't see priority immediately, and when tickets get triaged they get higher priority at some point later on. Maybe results would look different if I the query excluded reporters that are employees? Don't want to spend too much time trying though.
andre
On 05/07/2013 08:47 AM, Andre Klapper wrote:
On Mon, 2013-05-06 at 12:52 -0700, Greg Grossmeier wrote:
Also, the majority of bugs that are in the Highest/Immediate priority level (from my gut assessment, I don't have the data here) are found after a deploy to non-WP projects.
I agree with that impression: We don't get many (manually found) highest/immediate prio bug reports after the first deployment phase, most of them after phase 2, and a few after phase 3 (e.g. when we failed to understand the explosive force of an issue).
Backing that impression up with Bugzilla data: <tl;dr>: That's hard.
Long version: I tried a Bugzilla query for tickets created in the last four months, that at some point in their lifetime had Priority = {Highest | Immediate}, restricted it to the products {MediaWiki, MediaWiki extensions, Wikimedia}, made buglist.cgi display the "Opened" column (via "Change columns" at the bottom); dropped the last 9 characters of the "Opened" column (to get rid of the time and only have the date, though that's UTC so does not perfectly fit our deployment *time*), imported the resulting CSV into OOCalc, cumulated a bit, and summed up all those tickets that got filed in a certain deployment phase && at *some* point became highest/immediate. See attachment.
The results don't back up my impression. One potential reason: Development teams file tickets *at some point* and don't see priority immediately, and when tickets get triaged they get higher priority at some point later on. Maybe results would look different if I the query excluded reporters that are employees? Don't want to spend too much time trying though.
andre
Andre, thanks for trying to run the query on this. Yeah, we split, rename, reprioritize, and otherwise mess with our Bugzilla tickets so often that it'd probably be a huge travail to thoroughly research the question I asked, and that would delay this decision beyond this week. So I'm fine with going on people's rough estimates and impressions instead of going the "ALL DATA ALL THE TIME" route.
<quote name="Sumana Harihareswara" date="2013-05-07" time="10:20:05 -0400">
Backing that impression up with Bugzilla data: <tl;dr>: That's hard.
<snip>
The results don't back up my impression. One potential reason: Development teams file tickets *at some point* and don't see priority immediately, and when tickets get triaged they get higher priority at some point later on. Maybe results would look different if I the query excluded reporters that are employees? Don't want to spend too much time trying though.
Andre, thanks for trying to run the query on this. Yeah, we split, rename, reprioritize, and otherwise mess with our Bugzilla tickets so often that it'd probably be a huge travail to thoroughly research the question I asked, and that would delay this decision beyond this week. So I'm fine with going on people's rough estimates and impressions instead of going the "ALL DATA ALL THE TIME" route.
Yeah, there isn't any pattern in that graph as is and it would need some more intense work to get "right".
Thanks for trying, Andre!
Greg
Le 06/05/13 21:04, Greg Grossmeier a écrit :
Let's move our MediaWiki deploy cycle to weekly instead of 2-week.
As long as we tend to the hour, I am fine :-]
Weekly deploy is a bit of a challenge since we will have less time to figure out a solution for any blocking bug, but overall I think it will be a nice improvement to get less changes deployed.
I am also wondering how many manual tasks we have to handle when doing and after the deployment. If we identify them and write tools for it, that would make it later possible to do two deployment per week :-]
Greg Grossmeier wrote:
== What are the drawbacks of two-weeks? ==
These are mostly known by everyone so I'll just simply state the most obvious one :-)
It takes up to 2 weeks for new features/bug fixes to be rolled out to the various Wikimedia wikis.
This is not a bad thing, though. Two weeks is a helluva lot better than it could be (or has been). :-)
These conversations get a little murky when we're discussing deployments. I feel like there's a distinction between MediaWiki core and extension general fixes and upgrades versus new features and functionality being deployed to Wikimedia wikis. I think everyone agrees that the faster we can get bug fixes in, the better. But when it comes to triaging and prioritizing features and enabling certain features by default... does such a distinction exist outside my mind?
The reason I ask about a distinction is that there have been a lot of changes to Wikimedia wikis lately and likely more to come, as the Wikimedia Foundation has gotten larger and has more dedicated tech resources. Overall, this is great. But big new features come with big changes, and these changes sometimes need a bit of breathing room. I've read a lot of pushback lately against rapid changes (usurping usernames, getting rid of the new message indicator, etc.). A lot of this seems mostly outside the scope how often to deploy (and I don't want to shift the focus of this thread), but it gets confusing (to me, at least) to make a distinction between new code/features on Wikimedia wikis and how often to branch MediaWiki core/extensions.
MZMcBride
On Mon, May 6, 2013 at 7:20 PM, MZMcBride z@mzmcbride.com wrote:
The reason I ask about a distinction is that there have been a lot of changes to Wikimedia wikis lately and likely more to come, as the Wikimedia Foundation has gotten larger and has more dedicated tech resources. Overall, this is great. But big new features come with big changes, and these changes sometimes need a bit of breathing room. I've read a lot of pushback lately against rapid changes (usurping usernames, getting rid of the new message indicator, etc.). A lot of this seems mostly outside the scope how often to deploy (and I don't want to shift the focus of this thread), but it gets confusing (to me, at least) to make a distinction between new code/features on Wikimedia wikis and how often to branch MediaWiki core/extensions.
A lot of this could potentially be addressed in a consistent manner across wikis if we applied the alpha->beta->prod (or just beta->prod for starters) channel model that's used on the Wikimedia mobile sites. Then features (whether in core or extensions) could be flagged for alpha or beta readiness, and users would only get them if they've decided to opt into either of those channels. We could still flip the switch from beta->prod, but that decision could be decoupled from the weekly deployment cycle.
This would likely be done for features & changes which have significant user-facing impact, and where segregation into "on" and "off" modes is possible (not always the case).
We may want to consider at least putting some such scaffolding for beta->prod desktop modes into place before shifting to weekly deployments, although if that holds up this change significantly, I'd be in favor of making the shift first and then iterating.
Right now we have lots of individual "experimental" prefs, some dark launch URL parameters (&useNew=1 for the account creation forms etc.), some changes that are announced widely but then rolled out immediately (section edit link change), etc. What would be the disadvantage of having a single "I'd like the latest and greatest changes once they come in" preference for our users? The main disadvantage I see is that we'd need to temporarily retain two codepaths for significant user-facing changes, potentially increasing code complexity a fair bit, but perhaps reducing post-launch cost in return. And we'd need to consider more carefully if/when to make the beta/prod switch -- not necessarily a bad thing. ;-)
Have there been any negative experiences with this model on the mobile sites?
Erik
-- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation
No negative experiences at all Erik. The only real problems we've run into are people complaining they weren't aware of editing features that we pushed to mobile that users were not aware of due to not using mobile. I reckon this would be less of an issue in desktop.
I would ___ love ___ to see a stable, beta, alpha model on desktop ...
After reading a lot of the Echo feedback today I feel a lot of the problems, complaints about not hearing about it could have been addressed by being in a beta mode. It has also made innovation, experimentation, testing and transparency much easier than it would have been had we just pushed new features directly to large audiences.
Please help make this happen. I'm happy to talk more in detail with anyone who wants to implement this. On 6 May 2013 19:43, "Erik Moeller" erik@wikimedia.org wrote:
On Mon, May 6, 2013 at 7:20 PM, MZMcBride z@mzmcbride.com wrote:
The reason I ask about a distinction is that there have been a lot of changes to Wikimedia wikis lately and likely more to come, as the Wikimedia Foundation has gotten larger and has more dedicated tech resources. Overall, this is great. But big new features come with big changes, and these changes sometimes need a bit of breathing room. I've read a lot of pushback lately against rapid changes (usurping usernames, getting rid of the new message indicator, etc.). A lot of this seems mostly outside the scope how often to deploy (and I don't want to shift the focus of this thread), but it gets confusing (to me, at least) to
make
a distinction between new code/features on Wikimedia wikis and how often to branch MediaWiki core/extensions.
A lot of this could potentially be addressed in a consistent manner across wikis if we applied the alpha->beta->prod (or just beta->prod for starters) channel model that's used on the Wikimedia mobile sites. Then features (whether in core or extensions) could be flagged for alpha or beta readiness, and users would only get them if they've decided to opt into either of those channels. We could still flip the switch from beta->prod, but that decision could be decoupled from the weekly deployment cycle.
This would likely be done for features & changes which have significant user-facing impact, and where segregation into "on" and "off" modes is possible (not always the case).
We may want to consider at least putting some such scaffolding for beta->prod desktop modes into place before shifting to weekly deployments, although if that holds up this change significantly, I'd be in favor of making the shift first and then iterating.
Right now we have lots of individual "experimental" prefs, some dark launch URL parameters (&useNew=1 for the account creation forms etc.), some changes that are announced widely but then rolled out immediately (section edit link change), etc. What would be the disadvantage of having a single "I'd like the latest and greatest changes once they come in" preference for our users? The main disadvantage I see is that we'd need to temporarily retain two codepaths for significant user-facing changes, potentially increasing code complexity a fair bit, but perhaps reducing post-launch cost in return. And we'd need to consider more carefully if/when to make the beta/prod switch -- not necessarily a bad thing. ;-)
Have there been any negative experiences with this model on the mobile sites?
Erik
-- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Erik Moeller wrote:
What would be the disadvantage of having a single "I'd like the latest and greatest changes once they come in" preference for our users? The main disadvantage I see is that we'd need to temporarily retain two codepaths for significant user-facing changes, potentially increasing code complexity a fair bit, but perhaps reducing post-launch cost in return.
It would increase code complexity a lot, and it would make debugging more difficult. Don't forget that the desktop execution environment (front & back) is much more heterogenous -- you have gadgets, substantial interface customization via the MediaWiki NS, multiple skins, legacy interfaces, and a much wider set of extensions, all of which conspire to make bugs much harder to reproduce. I'm inclined to think that we already get most of the benefits this approach purports to confer from the general MediaWiki release cycle.
On Mon, May 6, 2013 at 7:43 PM, Erik Moeller erik@wikimedia.org wrote:
We may want to consider at least putting some such scaffolding for beta->prod desktop modes into place before shifting to weekly deployments, although if that holds up this change significantly, I'd be in favor of making the shift first and then iterating.
Right now we have lots of individual "experimental" prefs, some dark launch URL parameters (&useNew=1 for the account creation forms etc.), some changes that are announced widely but then rolled out immediately (section edit link change), etc. What would be the disadvantage of having a single "I'd like the latest and greatest changes once they come in" preference for our users? The main disadvantage I see is that we'd need to temporarily retain two codepaths for significant user-facing changes, potentially increasing code complexity a fair bit, but perhaps reducing post-launch cost in return. And we'd need to consider more carefully if/when to make the beta/prod switch -- not necessarily a bad thing. ;-)
Have there been any negative experiences with this model on the mobile sites?
For the most part, this model has been awesome for mobile. It allows us to iterate on feature ideas quickly, try a lot of different things, and focus on the things that really work. Getting a partially-polished feature in front of a group of users who expect to have things not always work 100% is incredibly valuable - and takes off the pressure of having to get something totally right. If it turns out the feature idea was a bust, it's generally no big deal for us (the mobile team or the users) to either can it or tweak it to make it better. Having an 'alpha' that is essentially a developer sandobx (little to no productization happens for our alpha) and some of the best mobile features have grown from here.
We've had occasional issues with beta or alpha features bleeding into a mode they weren't supposed to - though this is generally quick to fix. Ultimately, the issues we've had with the approach have been minor.
If this were something adopted by Mediawiki more generally, I would want to see it carefully built into core. Ori brings up a good point about increased code complexity, which is a guarantee with this kind of approach but could certainly be mitigated if the plumbing were well architected.
I think thoughtful development of something like this into core would be awesome and would allow us to collectively build better, more useful features, faster.
On Mon, May 6, 2013 at 7:43 PM, Erik Moeller erik@wikimedia.org wrote:
On Mon, May 6, 2013 at 7:20 PM, MZMcBride z@mzmcbride.com wrote:
The reason I ask about a distinction is that there have been a lot of changes to Wikimedia wikis lately and likely more to come, as the Wikimedia Foundation has gotten larger and has more dedicated tech resources. Overall, this is great. But big new features come with big changes, and these changes sometimes need a bit of breathing room. I've read a lot of pushback lately against rapid changes (usurping usernames, getting rid of the new message indicator, etc.). A lot of this seems mostly outside the scope how often to deploy (and I don't want to shift the focus of this thread), but it gets confusing (to me, at least) to make a distinction between new code/features on Wikimedia wikis and how often to branch MediaWiki core/extensions.
A lot of this could potentially be addressed in a consistent manner across wikis if we applied the alpha->beta->prod (or just beta->prod for starters) channel model that's used on the Wikimedia mobile sites. Then features (whether in core or extensions) could be flagged for alpha or beta readiness, and users would only get them if they've decided to opt into either of those channels. We could still flip the switch from beta->prod, but that decision could be decoupled from the weekly deployment cycle.
This would likely be done for features & changes which have significant user-facing impact, and where segregation into "on" and "off" modes is possible (not always the case).
I think this is awesome for features ... but if we're putting work into this, I would love even more to have a clustered a+b production environment, such that 10% of folks are put on the new release (cluster a) and then it gets pushed over to cluster b. Then we can also test performance in a real world environment, and breakages only happen for 10% (PS the 10% number was pulled out of thin air).
The opt-in beta is much too limited, as well as being inapplicable to the vast majority of our traffic (logged in users are such a small percentage) to make proper comparisons. You could also see the impact of features on usage for average users.
We may want to consider at least putting some such scaffolding for beta->prod desktop modes into place before shifting to weekly deployments, although if that holds up this change significantly, I'd be in favor of making the shift first and then iterating.
Right now we have lots of individual "experimental" prefs, some dark launch URL parameters (&useNew=1 for the account creation forms etc.), some changes that are announced widely but then rolled out immediately (section edit link change), etc. What would be the disadvantage of having a single "I'd like the latest and greatest changes once they come in" preference for our users? The main disadvantage I see is that we'd need to temporarily retain two codepaths for significant user-facing changes, potentially increasing code complexity a fair bit, but perhaps reducing post-launch cost in return. And we'd need to consider more carefully if/when to make the beta/prod switch -- not necessarily a bad thing. ;-)
Have there been any negative experiences with this model on the mobile sites?
Erik
-- Erik Möller VP of Engineering and Product Development, Wikimedia Foundation
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
-- Leslie Carr Wikimedia Foundation AS 14907, 43821 http://as14907.peeringdb.com/
<quote name="Leslie Carr" date="2013-05-07" time="11:43:47 -0700">
I think this is awesome for features ... but if we're putting work into this, I would love even more to have a clustered a+b production environment, such that 10% of folks are put on the new release (cluster a) and then it gets pushed over to cluster b. Then we can also test performance in a real world environment, and breakages only happen for 10% (PS the 10% number was pulled out of thin air).
I really like this idea, I think. So bringing various concurrent (email) threads together; long term we could have something that looks like this (roughly):
1) Change proposed - Jenkins runs tests on a throw away labs instance
2) Change merged to master - Jenkins/etc runs tests on betalabs
3) New wmfXX released to 10% of cluster - "10%" being something like: test, test2, mediawiki.org, and some of the non-'pedia project sites - Our users do the testing ;-)
4) That wmfXX goes to the rest of the cluster - Hopefully all is good.
Is that kind of what you had in mind, Leslie?
Greg
On 05/07/2013 03:02 PM, Greg Grossmeier wrote:
- New wmfXX released to 10% of cluster
- "10%" being something like: test, test2, mediawiki.org, and some of the non-'pedia project sites
- Our users do the testing ;-)
I was interpreting Leslie as saying 10% cross-cut throughout. I.E. 10% of German Wiktionary, 10% of Japanese Wikinews, 10% of English Wikipedia, 10% of everything.
That would mean we would really get wide testing before it rolled out to the 90%.
But I don't know which she meant.
Matt Flaschen
On 05/07/2013 04:46 PM, Matthew Flaschen wrote:
I was interpreting Leslie as saying 10% cross-cut throughout. I.E. 10% of German Wiktionary, 10% of Japanese Wikinews, 10% of English Wikipedia, 10% of everything.
That would mean we would really get wide testing before it rolled out to the 90%.
For this to work for user-facing changes, though, there would have to be a clear indicator of which you're on (maybe even an automatic bug report tool), or reports would be confusing.
Matt Flaschen
<quote name="Matthew Flaschen" date="2013-05-07" time="16:47:32 -0400">
On 05/07/2013 04:46 PM, Matthew Flaschen wrote:
I was interpreting Leslie as saying 10% cross-cut throughout. I.E. 10% of German Wiktionary, 10% of Japanese Wikinews, 10% of English Wikipedia, 10% of everything.
That would mean we would really get wide testing before it rolled out to the 90%.
For this to work for user-facing changes, though, there would have to be a clear indicator of which you're on (maybe even an automatic bug report tool), or reports would be confusing.
Yeah, that was my first interpretation, but I wasn't sure how it'd work in our situation, so I internally modified and then wrote out the steps I did :-).
Greg
Leslie Carr wrote:
I think this is awesome for features ... but if we're putting work into this, I would love even more to have a clustered a+b production environment, such that 10% of folks are put on the new release (cluster a) and then it gets pushed over to cluster b. Then we can also test performance in a real world environment, and breakages only happen for 10% (PS the 10% number was pulled out of thin air).
The opt-in beta is much too limited, as well as being inapplicable to the vast majority of our traffic (logged in users are such a small percentage) to make proper comparisons. You could also see the impact of features on usage for average users.
I thought the idea was to not annoy users with unpolished/beta features, unless they've opted in. This seems to be the rationale for mobile.
MZMcBride
wikitech-l@lists.wikimedia.org