Sure, I understand how research is done.

However, you could feasibly create components that allow for certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect: the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

Edward Saperia

Conference Director Wikimania London

email • facebook • twitter • 07796955572

133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes <okeyes@wikimedia.org> wrote:

Neither; the tools we have for running experiments are largely hand-built on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. Then, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.
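
A minimal sketch of the 50/50 split described above, assuming each reader or editor has some stable identifier. The function name, the experiment label, and the hashing scheme are all hypothetical and chosen for illustration; this is not how any existing Wikimedia tooling assigns buckets.

```python
import hashlib


def assign_bucket(user_id: str,
                  experiment: str = "contributor-tagline",
                  treatment_share: float = 0.5) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing the experiment label together with the user id means the same
    user always sees the same variant, and different experiments get
    independent splits. All names here are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash onto the interval [0, 1).
    fraction = int(digest[:8], 16) / 0x100000000
    return "treatment" if fraction < treatment_share else "control"


if __name__ == "__main__":
    for uid in ["user-1", "user-2", "user-3"]:
        print(uid, assign_bucket(uid))
```

Lowering `treatment_share` gives the smaller experimental sample mentioned above, with the vast majority of editors keeping the default experience.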

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.
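
To make the point concrete, here is a sketch of one candidate test, a two-sided two-proportion z-test. It is only appropriate under assumptions that would have to be checked against the real data first: each view is an independent trial with a yes/no outcome (edit attempted or not), and the samples are large enough for the normal approximation. The function name and the counts in the usage lines are invented for illustration.

```python
import math


def two_proportion_z_test(edits_a: int, views_a: int,
                          edits_b: int, views_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in edit rates.

    Assumes independent binary outcomes and large samples; if the data
    do not look like that, a different test is needed.
    """
    rate_a = edits_a / views_a
    rate_b = edits_b / views_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (edits_a + edits_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented counts: edit attempts and views with and without the tagline.
    z, p = two_proportion_z_test(120, 1000, 80, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The code is the easy part. Deciding whether these assumptions hold for a given experiment is the step that resists automation.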

The alternative would be something observational: you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia <ed@wikimanialondon.org> wrote:

You mean, you don't have them yourselves, or you can't expose them?

Edward Saperia

On 26 August 2014 15:46, Oliver Keyes <okeyes@wikimedia.org> wrote:

Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting happens on a per-wiki or per-project basis, but we have a big gap in the functionality and expertise needed to scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia <ed@wikimanialondon.org> wrote:

There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above:

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on. For example, maybe WikiProjects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

Edward Saperia

_______________________________________________
EE mailing list
EE@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/ee

--
Oliver Keyes
Research Analyst
Wikimedia Foundation
