Recommending things to edit

List overview All Threads
Download

newer

older

moderation in Flow

Re: [EE] [Wikitech-l] Is...

Steven Walling

26 Aug 2014 26 Aug '14

12:22 a.m.

Hi all,

I'd love to get some feedback on a new Growth team project: task recommendations for Wikipedia editors. The design specification and background for this is at https://www.mediawiki.org/wiki/Task_recommendations. I also gave a brief introduction to this at the last Foundation Metrics & Activities meeting, viewable at https://www.youtube.com/watch?v=2JbZ1uWoKEg#t=3483

The two prototype recommendation systems are live now on Beta Labs ( en.wikipedia.beta.wmflabs.org). If you edit copies of real articles (like Dog, Cat, or Cheese) you'll get some good results. However, this replica of English Wikipedia is a bit slow, so be patient with us.

The next step for this project is to A/B test this with newly-registered users on Wikipedia. Since translations of the interface have been pretty quick (thank you translators!), we'll likely A/B test in at least English, German, and French, if not other languages too. We've done some usability testing, including at Wikimania, but we need to run a randomized experiment to give us a first look at whether these recommendations can have a positive impact on new editor productivity and retention.

Right now, the main goal of the recommendations we've built is to get someone who's made their first few edits to keep going. That's why it only looks at the last article you edited, and makes recommendations off of that. In the future, we might consider doing something more sophisticated, such as combing through your entire edit history to recommend articles. We hope that we can build something which can be continually useful for a content contributor as they get more experienced.

Thanks,

-- Steven Walling, Product Manager https://wikimediafoundation.org/

Attachments:

attachment.htm (text/html — 2.2 KB)

Show replies by date

svetlana

26 Aug 26 Aug

1:08 a.m.

Hi!

Steven Walling wrote:

...

I'd love to get some feedback on a new Growth team project: task recommendations for Wikipedia editors. The design specification and background for this is at https://www.mediawiki.org/wiki/Task_recommendations.

I'd like to see this for anonymous contributors -- for the post-edit it should be easy, for the flyout version it needs some thinking on possible implementation.

Steven Walling wrote:

...

The next step for this project is to A/B test this with newly-registered users on Wikipedia.

Please (kicking and screaming ;-): - Ask at a local village pump and get community approval for enabling this extension. - Also test this at other Wikimedia projects, not only Wikipedia.

You'll find a lot of useful feedback from community and appreciation of your work.

svetlana

Steven Walling

1:49 a.m.

On Mon, Aug 25, 2014 at 4:08 PM, svetlana svetlana@fastmail.com.au wrote:

...

I'd like to see this for anonymous contributors -- for the post-edit it should be easy, for the flyout version it needs some thinking on possible implementation.

Ultimately whether we present recommendations to anonymous editors will depend on how the recommendations are generated. Right now, both the post-edit and flyout versions are based off your last edit, so they theoretically could work for any editor. Like you hinted at, the flyout is harder since you don't know if the person has actually edited before (or if it was someone else from that IP).

Any kind of filtering that looks at something more sophisticated would likely require have more persistent history about an editor, and thus might not work for unregistered users easily. Right now, for example, we already know that SuggestBot has a lot of success by combing through a user's entire edit history.

Our strategy will be first to figure out what works well for making good recommendations, then see if we can extend it to unregistered users if possible. Prioritizing recommendations for registered users makes sense not just because it's technologically easier, but because we know registered editors make the bulk of contributions to Wikipedia. Providing personalized functionality to only registered users is something that's a pretty common pattern on wikis, whether it's notifications, watchlist, or other things.

...

Steven Walling wrote:

...
The next step for this project is to A/B test this with newly-registered users on Wikipedia.

Please (kicking and screaming ;-):

Ask at a local village pump and get community approval for enabling this

extension.

Also test this at other Wikimedia projects, not only Wikipedia.

We generally don't ask permission to enable extensions. Thats a technical decision that gets made as part of the deployment cycle. Plus, recommendations is not a new extension, but is an optional feature of an extension that has been deployed to many Wikipedias for a long time (GettingStarted). We'll simply turn it on for a short time as part of a test, then turn it off while we analyze the results.

The primary purpose of the new functionality is to aid new editors, and it won't be presented to any existing registered users (not even on an opt-in basis). There's no point in polling existing community members about functionality they will not see. Running a short A/B test, in concert with usability testing, will provide us with an objective look at whether a particular feature helps new people contribute to the encyclopedia more or less.

As for enabling recommendations on other Wikimedia projects... we have no idea whether the recommendations will work for *any* project. Testing on Wikipedia is our first focus. From a practical standpoint, the size of large Wikipedias let's us run a comparatively short test to tell us statistically significant results. If it ends up being a success, then we should talk about whether the recommendations will work for non-encyclopedic projects as well. It would definitely be cool to have recommendations for editor communities like Wikidata, Wikivoyage, and Wiktionary too.

-- Steven Walling, Product Manager https://wikimediafoundation.org/

svetlana

3:49 a.m.

Steven Walling wrote:

...

Prioritizing recommendations for registered users makes sense not just because it's technologically easier, but because we know registered editors make the bulk of contributions to Wikipedia.

...

From my impression regarding main namespace, 80% of edits by number are made by unregistered contributors, of which 80% are vandalism. The latter doesn't justify ignoring the former, perhaps, though - I'd be very interested in making more features available to unregistered contributors where doing so is not too much effort. https://meta.wikimedia.org/wiki/Musings_about_unregistered_contributors contains detail on this matter (which, from what I could see, are best addressed by the Growth team).

Steven Walling wrote:

...

We generally don't ask permission to enable extensions. Thats a technical decision that gets made as part of the deployment cycle.

Nobody prevents you from asking, though, if you like. The Multimedia Team tries to start doing so. :-) However, could you please write up a note, once a week perhaps, which I could distribute to affected Wikimedia projects - with notes on planned releases, new feature additions, and how to test them? I'd like to talk to the people myself, perhaps, if you wouldn't like to waste your time: following the release plans is not very easy right now, as you're sending them out to mailing lists and there's no centralised releases and big changes calendar thing for Wikimedia Engineering Teams.

Steven Walling wrote:

...

Plus, recommendations is not a new extension, but is an optional feature of an extension that has been deployed to many Wikipedias for a long time (GettingStarted). We'll simply turn it on for a short time as part of a test, then turn it off while we analyze the results.

Ah. It'd be nice to have people know about the wonderful work you're doing; more on this in the above paragraph.

Steven Walling wrote:

...

The primary purpose of the new functionality is to aid new editors, and it won't be presented to any existing registered users (not even on an opt-in basis). There's no point in polling existing community members about functionality they will not see. Running a short A/B test, in concert with usability testing, will provide us with an objective look at whether a particular feature helps new people contribute to the encyclopedia more or less.

I imagine some people clearly remember how they got started and what helped and what didn't. Would be interested to collect thoughts on that and show them this ongoing work.

Steven Walling wrote:

...

As for enabling recommendations on other Wikimedia projects... we have no idea whether the recommendations will work for *any* project. Testing on Wikipedia is our first focus. From a practical standpoint, the size of large Wikipedias let's us run a comparatively short test to tell us statistically significant results. If it ends up being a success, then we should talk about whether the recommendations will work for non-encyclopedic projects as well. It would definitely be cool to have recommendations for editor communities like Wikidata, Wikivoyage, and Wiktionary too.

Good! :-)

By the way, is it possible to limit the suggestions to a certain category (or exclude a category)? Some projects appear to archive their articles and suggesting to edit them would make no sense, even if the topic is very similar.

svetlana

Matthew Flaschen

6:32 a.m.

On 08/25/2014 09:49 PM, svetlana wrote:

...

However, could you please write up a note, once a week perhaps, which I could distribute to affected Wikimedia projects - with notes on planned releases, new feature additions, and how to test them? I'd like to talk to the people myself, perhaps, if you wouldn't like to waste your time: following the release plans is not very easy right now, as you're sending them out to mailing lists and there's no centralised releases and big changes calendar thing for Wikimedia Engineering Teams.

I know you're not a big fan of mailing lists. You may find https://meta.wikimedia.org/wiki/Tech/News/Latest more to your liking. It's an on-wiki newsletter about feature developments and technical work. I think that's the closest thing to what you're seeking.

Matt Flaschen

Matthew Flaschen

6:26 a.m.

On 08/25/2014 07:49 PM, Steven Walling wrote:

...

Any kind of filtering that looks at something more sophisticated would likely require have more persistent history about an editor, and thus might not work for unregistered users easily. Right now, for example, we already know that SuggestBot has a lot of success by combing through a user's entire edit history.

If we're okay with just using the IP's contribution history, we have the same data. The issue is just if we're okay presenting it to the IP. We present the actual page (e.g. https://en.wikipedia.org/wiki/Special:Contributions/71.175.130.30), even though it may be more than one person. The question is whether it's alright to present an aggregate of that.

We should remember some IPs are static and persistent, though probably a minority.

...

Testing on Wikipedia is our first focus. From a practical standpoint,

the size

...

of large Wikipedias let's us run a comparatively short test to tell us statistically significant results. If it ends up being a success, then we should talk about whether the recommendations will work for non-encyclopedic projects as well. It would definitely be cool to have recommendations for editor communities like Wikidata, Wikivoyage, and Wiktionary too.

I do think this is a good potential candidate for other projects, though, particularly Wikivoyage (since it has a similar content model and no special archiving behavior that I know of).

Matt Flaschen

Edward Saperia

4:34 p.m.

...

There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

Oliver Keyes

4:46 p.m.

Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...

...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

Edward Saperia

4:52 p.m.

You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...

Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

Oliver Keyes

9:03 p.m.

Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...

You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

Oliver Keyes

9:03 p.m.

Bah; alternative mechanism for running this experiment would be something observational.

On 26 August 2014 15:03, Oliver Keyes okeyes@wikimedia.org wrote:

...

Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

-- Oliver Keyes Research Analyst Wikimedia Foundation

Edward Saperia

1 Sep 1 Sep

5:16 p.m.

Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...

Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

Jonathan Morgan

3 Sep 3 Sep

1:17 a.m.

I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia ed@wikimanialondon.org wrote:

...

Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
...
There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

Oliver Keyes

8:15 a.m.

+1, J-Mo and Ed. I want this to be done. I want analytics to be somewhere where the primary job isn't producing research (although we all love producing research) but creating an environment in which everyone (including us) can produce research. But (1) I don't speak for analytics engineering and (2) at the moment we've got a big deficit to catch up on. We're actively looking at things like dataset releases and tool development, but right now we have to get those datasets and work out what tools fit on top of them. Analytics Engineering has Vital Signs, visualisation and Hadoop - Research and Data has reader and editor metric definitions. But we're getting there :).

On 2 September 2014 19:17, Jonathan Morgan jmorgan@wikimedia.org wrote:

...

I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia ed@wikimanialondon.org wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
> There's no point in polling existing community members about > functionality they will not see. >

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

Edward Saperia

3:41 p.m.

In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should *always* be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

On 3 September 2014 00:17, Jonathan Morgan jmorgan@wikimedia.org wrote:

...

I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia ed@wikimanialondon.org wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org wrote:

...
> There's no point in polling existing community members about > functionality they will not see. >

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org/ email ed@wikimanialondon.org • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

Oliver Keyes

4:18 p.m.

Sure: point me to something I mentioned that's a closed product and isn't a prerequisite for an open one? :p

On Wednesday, 3 September 2014, Edward Saperia ed@wikimanialondon.org wrote:

...

In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should *always* be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org'); • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 3 September 2014 00:17, Jonathan Morgan <jmorgan@wikimedia.org javascript:_e(%7B%7D,'cvml','jmorgan@wikimedia.org');> wrote:

...
I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia <ed@wikimanialondon.org javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org');> wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org'); • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes <okeyes@wikimedia.org javascript:_e(%7B%7D,'cvml','okeyes@wikimedia.org');> wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia <ed@wikimanialondon.org javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org');> wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org'); • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes <okeyes@wikimedia.org javascript:_e(%7B%7D,'cvml','okeyes@wikimedia.org');> wrote:

...
Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia <ed@wikimanialondon.org javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org');> wrote:

> > >> There's no point in polling existing community members about >> functionality they will not see. >> > > While I am a great supporter of your team's work, I'd just like to > comment on the above; > > Wiser community members are aware that they are part of a powerful > ecosystem, and that taming this ecosystem is a far more leveraged pursuit > than doing the work yourself. Creating additional endpoints for onboarding > processes that you're exposing to new users should be something that all > projects are excited to take part in, so hopefully you'd want to poll the > community for the valuable "Yes, and..." responses you'll get. > > If you find you don't get responses like this, perhaps you might > want to consider re-framing your new functionality as open infrastructure > that the rest of the community is invited to build on, for example maybe > wikiprojects themselves could specify the suggestions that are shown to new > editors who edit in their subject areas? > > Given appropriate tools to track effectiveness, this could create a > huge, open environment for experimentation that could find interesting > solutions faster than any engineering department ever could on their own. > > *Edward Saperia* > Conference Director Wikimania London > http://www.wikimanialondon.org/ > email javascript:_e(%7B%7D,'cvml','ed@wikimanialondon.org'); • > facebook http://www.facebook.com/edsaperia • twitter > http://www.twitter.com/edsaperia • 07796955572 > 133-135 Bethnal Green Road, E2 7DG > > _______________________________________________ > EE mailing list > EE@lists.wikimedia.org > javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); > https://lists.wikimedia.org/mailman/listinfo/ee > >

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org javascript:_e(%7B%7D,'cvml','jmorgan@wikimedia.org');

EE mailing list EE@lists.wikimedia.org javascript:_e(%7B%7D,'cvml','EE@lists.wikimedia.org'); https://lists.wikimedia.org/mailman/listinfo/ee

-- Sent from a portable device of Lovecraftian complexity.

Edward Saperia

4:25 p.m.

Is this intended to be an open piece of infrastructure that anyone can edit? https://www.mediawiki.org/wiki/Task_recommendations

Of course, you can say anything closed is something that just hasn't been made open yet, but that's exactly why I raise the issue.

On 3 September 2014 15:18, Oliver Keyes okeyes@wikimedia.org wrote:

...

Sure: point me to something I mentioned that's a closed product and isn't a prerequisite for an open one? :p

On Wednesday, 3 September 2014, Edward Saperia ed@wikimanialondon.org wrote:

...
In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should *always* be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 3 September 2014 00:17, Jonathan Morgan jmorgan@wikimedia.org wrote:

...
I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia ed@wikimanialondon.org wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

...
You mean, you don't have them yourselves, or you can't expose them?

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote:

> Except we don't have those tools. There are a lot of domains in the > ecosystem where this kind of experimentation and targeting on a per-wiki or > per-project basis, but we have a big gap around functionality and expertise > to let us scientifically test the efficacy of their various implementations. > > > On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org > wrote: > >> >> >>> There's no point in polling existing community members about >>> functionality they will not see. >>> >> >> While I am a great supporter of your team's work, I'd just like to >> comment on the above; >> >> Wiser community members are aware that they are part of a powerful >> ecosystem, and that taming this ecosystem is a far more leveraged pursuit >> than doing the work yourself. Creating additional endpoints for onboarding >> processes that you're exposing to new users should be something that all >> projects are excited to take part in, so hopefully you'd want to poll the >> community for the valuable "Yes, and..." responses you'll get. >> >> If you find you don't get responses like this, perhaps you might >> want to consider re-framing your new functionality as open infrastructure >> that the rest of the community is invited to build on, for example maybe >> wikiprojects themselves could specify the suggestions that are shown to new >> editors who edit in their subject areas? >> >> Given appropriate tools to track effectiveness, this could create a >> huge, open environment for experimentation that could find interesting >> solutions faster than any engineering department ever could on their own. >> >> *Edward Saperia* >> Conference Director Wikimania London >> http://www.wikimanialondon.org/ >> email • facebook http://www.facebook.com/edsaperia • twitter >> http://www.twitter.com/edsaperia • 07796955572 >> 133-135 Bethnal Green Road, E2 7DG >> >> _______________________________________________ >> EE mailing list >> EE@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/ee >> >> > > > -- > Oliver Keyes > Research Analyst > Wikimedia Foundation > > _______________________________________________ > EE mailing list > EE@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/ee > >

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Sent from a portable device of Lovecraftian complexity.

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

Oliver Keyes

4:33 p.m.

No idea! I was talking about research and analytics tools. Research is providing support on the algorithms but the actual development is the E3 team, the openness of which Steven has already commented on above.

On 3 September 2014 10:25, Edward Saperia ed@wikimanialondon.org wrote:

...

Is this intended to be an open piece of infrastructure that anyone can edit? https://www.mediawiki.org/wiki/Task_recommendations

Of course, you can say anything closed is something that just hasn't been made open yet, but that's exactly why I raise the issue.

On 3 September 2014 15:18, Oliver Keyes okeyes@wikimedia.org wrote:

...
Sure: point me to something I mentioned that's a closed product and isn't a prerequisite for an open one? :p

On Wednesday, 3 September 2014, Edward Saperia ed@wikimanialondon.org wrote:

...
In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should *always* be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 3 September 2014 00:17, Jonathan Morgan jmorgan@wikimedia.org wrote:

...
I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia ed@wikimanialondon.org wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

...
Neither; the tools we have for running experiments are largely hand-build on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. *Then*, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.

The alternate hypothesis would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org wrote:

> You mean, you don't have them yourselves, or you can't expose them? > > *Edward Saperia* > Conference Director Wikimania London > http://www.wikimanialondon.org > email • facebook http://www.facebook.com/edsaperia • twitter > http://www.twitter.com/edsaperia • 07796955572 > 133-135 Bethnal Green Road, E2 7DG > > > On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote: > >> Except we don't have those tools. There are a lot of domains in the >> ecosystem where this kind of experimentation and targeting on a per-wiki or >> per-project basis, but we have a big gap around functionality and expertise >> to let us scientifically test the efficacy of their various implementations. >> >> >> On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org >> wrote: >> >>> >>> >>>> There's no point in polling existing community members about >>>> functionality they will not see. >>>> >>> >>> While I am a great supporter of your team's work, I'd just like to >>> comment on the above; >>> >>> Wiser community members are aware that they are part of a powerful >>> ecosystem, and that taming this ecosystem is a far more leveraged pursuit >>> than doing the work yourself. Creating additional endpoints for onboarding >>> processes that you're exposing to new users should be something that all >>> projects are excited to take part in, so hopefully you'd want to poll the >>> community for the valuable "Yes, and..." responses you'll get. >>> >>> If you find you don't get responses like this, perhaps you might >>> want to consider re-framing your new functionality as open infrastructure >>> that the rest of the community is invited to build on, for example maybe >>> wikiprojects themselves could specify the suggestions that are shown to new >>> editors who edit in their subject areas? >>> >>> Given appropriate tools to track effectiveness, this could create >>> a huge, open environment for experimentation that could find interesting >>> solutions faster than any engineering department ever could on their own. >>> >>> *Edward Saperia* >>> Conference Director Wikimania London >>> http://www.wikimanialondon.org/ >>> email • facebook http://www.facebook.com/edsaperia • twitter >>> http://www.twitter.com/edsaperia • 07796955572 >>> 133-135 Bethnal Green Road, E2 7DG >>> >>> _______________________________________________ >>> EE mailing list >>> EE@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/ee >>> >>> >> >> >> -- >> Oliver Keyes >> Research Analyst >> Wikimedia Foundation >> >> _______________________________________________ >> EE mailing list >> EE@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/ee >> >> > > _______________________________________________ > EE mailing list > EE@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/ee > >

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Sent from a portable device of Lovecraftian complexity.

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

Aaron Halfaker

5:10 p.m.

Both https://www.mediawiki.org/wiki/Task_recommendations and https://meta.wikimedia.org/wiki/Research:Task_recommendations are open to edits and discussion.

As for community involvement, most of the studies I run through my work on the Growth team have some level of volunteer involvement. See https://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation as an example. I actually iterated on some analysis on the talk page with those who showed up. I welcome more involvement. If you have an RQ, I want to make it testable and add it to the list.

As far as building infrastructure, in the case of the Growth team, I disagree. I think it is important that those who develop the shiney products are free to experiment without building infrastructure we might never use. In product, we're working to build theory about the effect of feature interventions and the ability to do that quick and dirty is important. It's towards the end of the experimentation cycle that we ought to consider making the technologies we have built more open to iteration, but even then, it should be primarily about delivering effective tools. Here, I'd like to see more participatory design. In this particular case, I pulled in User:Nettrom (maintainer of User:SuggestBot) since I figured his experience delivering personalized recommendations to Wikipedians would be critical. I'd be interested in re-hashing the discussions about what to build first with anyone who is interested. I've done so recently with Svetlana and advocated the expansion to anons in future iterations because of her activity in #wikimedia-growth.

Now, on the analytics side of the world, we're all about infrastructure. I've been working with the search team to make sure that the functions we use in CirrusSearch are commonly available. (Really, they do all of the work. I just talk to them.) Right now, you can do the exact same things that we're building into the task recommendation interface with Wikipedia's API. Try http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=... for example. This will get you a list of topicly similar articles to "Anarchism". We (the analytics team) also have an ongoing initiative to bring more data to the public labsDB instances so that it will be easier for non-WMF staff to work with. See https://meta.wikimedia.org/wiki/Schema:TaskRecommendation for one of the events that we're logging in the growth experiment. I'd like to make some or all of this data public to enable "community analytics". Right now, there's technical and political hurdles we're working past.

When you consider "community experimentation" infrastructure, you should also think of https://meta.wikimedia.org/wiki/Grants:IEG.

-Aaron

On Wed, Sep 3, 2014 at 4:33 PM, Oliver Keyes okeyes@wikimedia.org wrote:

...

No idea! I was talking about research and analytics tools. Research is providing support on the algorithms but the actual development is the E3 team, the openness of which Steven has already commented on above.

On 3 September 2014 10:25, Edward Saperia ed@wikimanialondon.org wrote:

...
Is this intended to be an open piece of infrastructure that anyone can edit? https://www.mediawiki.org/wiki/Task_recommendations

Of course, you can say anything closed is something that just hasn't been made open yet, but that's exactly why I raise the issue.

On 3 September 2014 15:18, Oliver Keyes okeyes@wikimedia.org wrote:

...
Sure: point me to something I mentioned that's a closed product and isn't a prerequisite for an open one? :p

On Wednesday, 3 September 2014, Edward Saperia ed@wikimanialondon.org wrote:

...
In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should *always* be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 3 September 2014 00:17, Jonathan Morgan jmorgan@wikimedia.org wrote:

...
I agree with you, Ed. Although I don't think that it's realistic to expect a product teamlike EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now *Analytics, *on the other hand.. (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday https://meta.wikimedia.org/wiki/Grants:IEG. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers, Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia <ed@wikimanialondon.org

...
wrote:

...
Sure, I understand how research is done.

However, you could feasibly create components that allow for a certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

*Edward Saperia* Conference Director Wikimania London http://www.wikimanialondon.org email • facebook http://www.facebook.com/edsaperia • twitter http://www.twitter.com/edsaperia • 07796955572 133-135 Bethnal Green Road, E2 7DG

On 26 August 2014 20:03, Oliver Keyes okeyes@wikimedia.org wrote:

> Neither; the tools we have for running experiments are largely > hand-build on an ad hoc basis. For data collection we have tools like > eventlogging, although they require developer energy to integrate with > [potential area of experimentation]. But for actually analysing the results > it looks very different. > > Let's use a couple of concrete examples: suppose we wanted to look > at whether there was a statistically significant variation in whether or > not people edited if we included a contributor tagline, versus didn't. We'd > need to take the same set of pages, ideally, and run a controlled study > around an A/B test. > > So first we'd display one version of the site for 50% of the > population and another for the other 50% (realistically we'd probably use > smaller sets and give the vast majority of editors the default experience, > but it's a hypothetical, so let's run with it). That would require > developer energy. Then we'd set up some kind of logging to pipe back edit > attempts and view attempts by [control sample/not control sample]. Also > developer energy, although much less. *Then*, crucially, we'd have > to actually do the analysis, which is not something that can be robustly > generalised. > > In this example we'd be looking for significance, so we'd be looking > at using some kind of statistical hypothesis test. Those vary depending on > what probability distributions the underlying population follows. So we > need to work out what probability distribution is most appropriate, and > then apply the test most appropriate to that distribution. And that's not > something that can be automated through software. As a result, we get the > data and then work out how to test for significance. > > The alternate hypothesis would be something observational; you make > the change and then compare the behaviour of people while the change is > live to their behaviour before and after. This cuts out most of the > developer cost but doesn't do anything for the research support or the > ad-hoc code and tools that need to come with it. > > > On 26 August 2014 10:52, Edward Saperia ed@wikimanialondon.org > wrote: > >> You mean, you don't have them yourselves, or you can't expose them? >> >> *Edward Saperia* >> Conference Director Wikimania London >> http://www.wikimanialondon.org >> email • facebook http://www.facebook.com/edsaperia • twitter >> http://www.twitter.com/edsaperia • 07796955572 >> 133-135 Bethnal Green Road, E2 7DG >> >> >> On 26 August 2014 15:46, Oliver Keyes okeyes@wikimedia.org wrote: >> >>> Except we don't have those tools. There are a lot of domains in >>> the ecosystem where this kind of experimentation and targeting on a >>> per-wiki or per-project basis, but we have a big gap around functionality >>> and expertise to let us scientifically test the efficacy of their various >>> implementations. >>> >>> >>> On 26 August 2014 10:34, Edward Saperia ed@wikimanialondon.org >>> wrote: >>> >>>> >>>> >>>>> There's no point in polling existing community members about >>>>> functionality they will not see. >>>>> >>>> >>>> While I am a great supporter of your team's work, I'd just like >>>> to comment on the above; >>>> >>>> Wiser community members are aware that they are part of a >>>> powerful ecosystem, and that taming this ecosystem is a far more leveraged >>>> pursuit than doing the work yourself. Creating additional endpoints for >>>> onboarding processes that you're exposing to new users should be something >>>> that all projects are excited to take part in, so hopefully you'd want to >>>> poll the community for the valuable "Yes, and..." responses you'll get. >>>> >>>> If you find you don't get responses like this, perhaps you might >>>> want to consider re-framing your new functionality as open infrastructure >>>> that the rest of the community is invited to build on, for example maybe >>>> wikiprojects themselves could specify the suggestions that are shown to new >>>> editors who edit in their subject areas? >>>> >>>> Given appropriate tools to track effectiveness, this could create >>>> a huge, open environment for experimentation that could find interesting >>>> solutions faster than any engineering department ever could on their own. >>>> >>>> *Edward Saperia* >>>> Conference Director Wikimania London >>>> http://www.wikimanialondon.org/ >>>> email • facebook http://www.facebook.com/edsaperia • twitter >>>> http://www.twitter.com/edsaperia • 07796955572 >>>> 133-135 Bethnal Green Road, E2 7DG >>>> >>>> _______________________________________________ >>>> EE mailing list >>>> EE@lists.wikimedia.org >>>> https://lists.wikimedia.org/mailman/listinfo/ee >>>> >>>> >>> >>> >>> -- >>> Oliver Keyes >>> Research Analyst >>> Wikimedia Foundation >>> >>> _______________________________________________ >>> EE mailing list >>> EE@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/ee >>> >>> >> >> _______________________________________________ >> EE mailing list >> EE@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/ee >> >> > > > -- > Oliver Keyes > Research Analyst > Wikimedia Foundation > > _______________________________________________ > EE mailing list > EE@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/ee > >

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Jonathan T. Morgan Learning Strategist Wikimedia Foundation User:Jmorgan (WMF) https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF) jmorgan@wikimedia.org

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Sent from a portable device of Lovecraftian complexity.

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

-- Oliver Keyes Research Analyst Wikimedia Foundation

EE mailing list EE@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/ee

Steven Walling

27 Aug 27 Aug

8:17 p.m.

On Tue, Aug 26, 2014 at 7:34 AM, Edward Saperia ed@wikimanialondon.org wrote:

...

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

*Edward Saperia*

Hi Ed,

Thanks for your thoughtful comments. What Oliver said about "appropriate tools to track effectiveness" is correct. Additionally though, I should note that what we're building in Growth is already an open platform. Just a very experimental and not particularly well-documented one. ;-)

The very simple recommendations engine we've built is a part of the public MediaWiki API.[1] If there are WikiProject members or others in the community with the technical chops and will to extend what we're doing for different use cases, we'd be happy to help with advice and code review.

In the meantime, however, our team's goal is to find the right kind of editor for Wikipedia, get them contributing, and get them to stick around.[2] We're going to do whatever it takes to make that happen.

1. See the 'gettingstartedgetpages' portion of https://en.wikipedia.org/w/api.php 2. https://www.mediawiki.org/wiki/Growth/2014-15_Goals

-- Steven Walling, Product Manager https://wikimediafoundation.org/

Steven Walling

11:53 p.m.

On Mon, Aug 25, 2014 at 3:22 PM, Steven Walling swalling@wikimedia.org wrote:

...

We've done some usability testing, including at Wikimania

For those interested, I just posted a summary of the test results from Wikimania.

You can read them at https://www.mediawiki.org/wiki/Task_recommendations/Usability_testing where we'll also post other usability test results.

-- Steven Walling, Product Manager https://wikimediafoundation.org/

Steven Walling

5 Sep 5 Sep

3:14 a.m.

On Mon, Aug 25, 2014 at 3:22 PM, Steven Walling swalling@wikimedia.org wrote:

...

The next step for this project is to A/B test this with newly-registered users on Wikipedia. Since translations of the interface have been pretty quick (thank you translators!), we'll likely A/B test in at least English, German, and French, if not other languages too.

Hi all, quick update on the upcoming test...

The good news: the main interfaces are well translated enough for us to launch our A/B test in 12 languages! This is the biggest variety of languages we'll ever have run an experiment in, as far as I know. The list is...

- English - German - French - Spanish - Italian - Russian - Chinese - Ukrainian - Swedish - Dutch - Hebrew

The bad news: we needed to add some error messages for the rare occasion that something goes wrong. These are *almost *merged and will ideally also get translated too. If you can help out on translatewiki.net or know someone who can, the new messages to translate will be in the GettingStarted extension.

Thanks!

-- Steven Walling, Product Manager https://wikimediafoundation.org/

3735

Age (days ago)

3746

Last active (days ago)

ee@lists.wikimedia.org

21 comments

7 participants

tags (0)

participants (7)

Aaron Halfaker
Edward Saperia
Jonathan Morgan
Matthew Flaschen
Oliver Keyes
Steven Walling
svetlana