On Tue, Jul 23, 2013 at 12:19 AM, Risker <risker.wp@gmail.com> wrote:
You pretty much had one chance at A/B testing, and it's done now. You can't repeat the tests as long as VE is the default editor.
That's not correct at all. It's still entirely possible to deliver different editing environments to randomized sets of new users, through the magic of software. In my opinion, we should run a similar A/B test of VE again.
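To make "randomized sets of new users" concrete, here is a minimal sketch of one common way to do it: hash each new user's ID into a stable bucket. This is an illustration only, not how the actual VE test was implemented. The function name, the experiment label, the bucket names, and the numeric user ID are all assumptions made for the example.

```python
import hashlib


def assign_bucket(user_id: int,
                  experiment: str = "ve-ab-test",
                  treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to an editing environment.

    All names here are hypothetical. The same (experiment, user_id)
    pair always lands in the same bucket, so a user sees a consistent
    editor across sessions without any stored assignment.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Map the first 4 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    return "visualeditor" if fraction < treatment_share else "wikitext"


if __name__ == "__main__":
    # Example: bucket a handful of made-up user IDs.
    for uid in (1001, 1002, 1003, 1004):
        print(uid, assign_bucket(uid))
```

Because assignment depends only on the user ID and the experiment label, it keeps working whichever editor is the site-wide default, and changing the label re-randomizes users for a new test.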
This kind of testing isn't easy the first time you do it. What was supposed to be a week-long test (the usual minimum period we use when evaluating editor-related features) had to be pared down to just three days of data, which is unfortunate but not entirely unexpected considering we had never done this kind of data collection with VE before.
Those three days of data come from a period in which, as Erik noted, there were major errors with the browser blacklist and other issues, so the negative results were likely due to VE simply being buggy pre-launch in June. Aaron says this in his draft conclusions: "As mentioned in the discussion of Quantity of contribution, several known and unknown VisualEditor bugs may have prevented newcomers from saving changes to articles. The decreased probability of successfully saving an edit discussed above could be the result of such bugs."[1] In the month since, the VE team has responded by fixing numerous bugs.
If you want to understand what the test results suggest, particularly regarding what the next steps in evaluating VE from a quantitative standpoint should be, I believe Aaron is working on suggestions for further testing.
Steven
1. https://meta.wikimedia.org/wiki/Research:VisualEditor%27s_effect_on_newly_re...