Thanks for cc'ing me Jonathan, I wouldn't have seen this otherwise.
TL;DR - Objectively measurable criteria. Clear process. No surprises.
For context: I gave Vector as a good example *of process* after the presentation about the future of 'Flow' at Wikimania.[1] I highly recommend reading the slides from that session if you haven't already - great stuff![2] In particular, I was talking about how the Usability Initiative team was the first at the WMF to use an opt-in Beta process. It was their use of iterative development, progressive rollout, and closed-loop feedback that made their work a successful *process*. I wasn't talking about the Vector skin per se.
Significantly, they had a publicly declared, measurable criterion for determining what counted as "community acceptance/support": an 80% retention rate among opt-in users. They did not lock down the features of one version of their beta and move to the next until they could show that 80% of the people who tried it preferred it. Moreover, they stuck to this objective criterion for measuring consensus support all the way to the final rollout.[3]
This system was a great way to identify people who were willing to change but had concerns, as opposed to getting bogged down by people who would never willingly accept a change, or people who would accept any change regardless. It also meant that those people became 'community advocates' for the new system, because they had the positive experience of seeing their feedback taken into account.
And I DO remember the process, and the significance the team (which included Trevor Parscal) attached to it, because in 2009 I interviewed the whole team in person for the Wikipedia Weekly podcast.[4] Far from "looking at the past through rose coloured glasses", I recall the specific pain-points on the day the Vector skin became the default: the inter-language links list was autocollapsed, and the Wikipedia logo was updated.[5] The fact that it was THESE things that caused all the controversy on the day Vector went from Beta to opt-out is instructive. These were the two things that had NOT been part of the Beta testing period - no process, only surprises. The people who had valid feedback had not been given an opportunity to provide it, so that feedback came instead as swift criticism on the mailing lists.[6]
My support for the concept of a clearly defined, objectively measured rollout *process* for new features is not new... When Fabrice announced "beta features" in November 2013, I was the first to respond - referring to the same examples, and telling the same story about the Usability Initiative's processes.[7]
Then, as now, the "beta features" tab lists the number of users who have opted in to a tool, but there is no comparative or objective explanation of what that number actually means! For example, it tells me that 33,418 people have opted in to "Hovercards" - but is that good? How long did it take to reach that level? How many people have switched it off? What proportion of the active editorship is that? And most importantly: what relationship does this number have to whether Hovercards will 'graduate' from, or 'fail', the opt-in Beta process?
Which brings me to the point I made to Jonathan, and also Pau, at Wikimania about the future of Flow.
If there are two things we Wikimedians hate most, I've come to believe they are:
1) The absence of a clear process, or a failure to follow that process
2) Being surprised
We can, generally, abide outcomes/decisions that we don't like (e.g. article-deletion debates) as long as the process by which the decision was reached was clearly explained and objectively followed. I believe this is why there was so much anger and frustration about the 'autoconfirm article creation trial' on en.wp [8] and the 'superprotect' controversy: they represented a failure to follow a process, and a surprise (respectively).
So, even more than the Vector skin or even the Visual Editor, Flow ABSOLUTELY MUST have a clear, objectively measurable, *process* for measuring community consensus because it will be replacing community-designed and community-operated workflows (e.g. [9]). This means that once it is enabled on a particular workflow:
1) an individual user can't opt out and return to the old system.
2) it will most affect, and be most used by, admins and other very-active-users.
Therefore, I believe this development must be an iterative process of working on one workflow, on one wiki, at a time, with objective measures of consensus-support that are at least partially *determined by the affected community itself*. This is the only way Flow can gain community consensus for replacing the existing template/sub-page/gadget/transclusion/category-based workflows.[10]
Because Flow will be updating admin-centric workflows, if it is rolled out in a way that is anything less than this, it will strike the community as hubris - "it is necessary to destroy the town in order to save it".[11]
-Liam / Wittylama
P.S. While you're at it, please make ALL new features go through "beta features", with a consistent, discoverable graduation process. As it is, some things live there permanently in limbo, some things DO have a process associated with them, and some things bypass the beta system altogether. As bawolff said, this means people feel they don't have any influence over the rollout process and therefore choose not to be involved at all.[12]