Both https://www.mediawiki.org/wiki/Task_recommendations and https://meta.wikimedia.org/wiki/Research:Task_recommendations are open to edits and discussion.

As for community involvement, most of the studies I run through my work on the Growth team have some level of volunteer involvement. See https://meta.wikimedia.org/wiki/Research:Wikipedia_article_creation as an example. I actually iterated on some analysis on the talk page with those who showed up. I welcome more involvement. If you have an RQ, I want to make it testable and add it to the list.

As far as building infrastructure, in the case of the Growth team, I disagree. I think it is important that those who develop the shiny products are free to experiment without building infrastructure we might never use. In product, we're working to build theory about the effect of feature interventions, and the ability to do that quick and dirty is important. It's towards the end of the experimentation cycle that we ought to consider making the technologies we have built more open to iteration, but even then, it should be primarily about delivering effective tools. Here, I'd like to see more participatory design. In this particular case, I pulled in User:Nettrom (maintainer of User:SuggestBot) since I figured his experience delivering personalized recommendations to Wikipedians would be critical. I'd be interested in re-hashing the discussions about what to build first with anyone who is interested. I've done so recently with Svetlana and advocated the expansion to anons in future iterations because of her activity in #wikimedia-growth.

Now, on the analytics side of the world, we're all about infrastructure. I've been working with the search team to make sure that the functions we use in CirrusSearch are commonly available. (Really, they do all of the work. I just talk to them.) Right now, you can do the exact same things that we're building into the task recommendation interface with Wikipedia's API. Try http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=morelike:Anarchism&srbackend=CirrusSearch&format=jsonfm for example. This will get you a list of articles topically similar to "Anarchism". We (the analytics team) also have an ongoing initiative to bring more data to the public labsDB instances so that it will be easier for non-WMF staff to work with. See https://meta.wikimedia.org/wiki/Schema:TaskRecommendation for one of the events that we're logging in the growth experiment. I'd like to make some or all of this data public to enable "community analytics". Right now, there are technical and political hurdles we're working past.
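For anyone who wants to script the "morelike" query rather than paste it into a browser, here is a minimal Python sketch. It only reuses the parameters from the URL above, with format=json swapped in for jsonfm so the result is machine-readable. The function names are made up for illustration, and the response shape assumed in extract_titles (a "query" object holding a "search" list of hits with a "title" field) should be checked against a live response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the example URL in this message.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_morelike_url(title):
    """Build a search URL asking CirrusSearch for pages similar to `title`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": "morelike:" + title,
        "srbackend": "CirrusSearch",
        "format": "json",  # plain JSON instead of the pretty-printed jsonfm
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_titles(response):
    """Pull page titles out of a decoded response (assumed shape, see above)."""
    return [hit["title"] for hit in response.get("query", {}).get("search", [])]


def fetch_similar_titles(title):
    """Hypothetical helper: run the query over the network and return titles."""
    with urlopen(build_morelike_url(title)) as resp:
        return extract_titles(json.load(resp))
```

Calling fetch_similar_titles("Anarchism") would issue the same request as the link above and return just the list of titles.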

When you consider "community experimentation" infrastructure, you should also think of https://meta.wikimedia.org/wiki/Grants:IEG.

-Aaron

On Wed, Sep 3, 2014 at 4:33 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:

No idea! I was talking about research and analytics tools. Research is providing support on the algorithms but the actual development is the E3 team, the openness of which Steven has already commented on above.

On 3 September 2014 10:25, Edward Saperia <ed@wikimanialondon.org> wrote:

Is this intended to be an open piece of infrastructure that anyone can edit?
https://www.mediawiki.org/wiki/Task_recommendations

Of course, you can say anything closed is something that just hasn't been made open yet, but that's exactly why I raise the issue.

On 3 September 2014 15:18, Oliver Keyes <okeyes@wikimedia.org> wrote:
Sure: point me to something I mentioned that's a closed product and isn't a prerequisite for an open one? :p

On Wednesday, 3 September 2014, Edward Saperia <ed@wikimanialondon.org> wrote:

In a movement like this with a lot of very active, very leveraged community activity, it seems to me that we should always be trying to make things that are infrastructure instead of closed products.

cc Halfak - one of the few talks I managed to attend at Wikimania was his talk on "Research as Infrastructure", which I thought made the case very well.

Edward Saperia
Conference Director Wikimania London
email • facebook • twitter • 07796955572

133-135 Bethnal Green Road, E2 7DG

On 3 September 2014 00:17, Jonathan Morgan <jmorgan@wikimedia.org> wrote:

I agree with you, Ed. Although I don't think that it's realistic to expect a product team like EE/Growth to create these open research tools. Their primary output is always going to be the shiny products, not the slightly-less-shiny infrastructure. Now Analytics, on the other hand... (*coughs* and looks pointedly at Ironholds...).

Also, the next round of IEGs opened yesterday. There's probably a fundable project in what you describe, given a team with the right skill sets. I'd be happy to provide feedback on a proposal.

Cheers,
Jonathan

On Mon, Sep 1, 2014 at 8:16 AM, Edward Saperia <ed@wikimanialondon.org> wrote:

Sure, I understand how research is done.

However, you could feasibly create components that allow for certain types of experiments and open up the analysis side to the community. I think this could be a lot more successful than you'd expect - the community has many smart people, and together we could decide and promote best practice across projects/experiments. They'd also be able to drive suggestions for what new components to implement to expand the experiment space, and more generally grow interest in the work of the EE team.

I understand that what you're doing now is quick and dirty and just trying to get something up and working, but I hope that longer term you have in mind the capability of the community to help you in this kind of endeavour. We're all keen to grow participation, and giving us tools to experiment ourselves will ultimately be more effective than anything you can do centrally.

On 26 August 2014 20:03, Oliver Keyes <okeyes@wikimedia.org> wrote:

Neither; the tools we have for running experiments are largely hand-built on an ad hoc basis. For data collection we have tools like eventlogging, although they require developer energy to integrate with [potential area of experimentation]. But for actually analysing the results it looks very different.

Let's use a couple of concrete examples: suppose we wanted to look at whether there was a statistically significant variation in whether or not people edited if we included a contributor tagline, versus didn't. We'd need to take the same set of pages, ideally, and run a controlled study around an A/B test.

So first we'd display one version of the site for 50% of the population and another for the other 50% (realistically we'd probably use smaller sets and give the vast majority of editors the default experience, but it's a hypothetical, so let's run with it). That would require developer energy. Then we'd set up some kind of logging to pipe back edit attempts and view attempts by [control sample/not control sample]. Also developer energy, although much less. Then, crucially, we'd have to actually do the analysis, which is not something that can be robustly generalised.
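To make the 50/50 split concrete, here is a minimal Python sketch of one common way to assign users to groups: hash a stable identifier so the same user always lands in the same bucket. This is an illustration only, not a description of how any Wikimedia experiment actually assigns users; the function name, the experiment label and the use of SHA-256 are all assumptions.

```python
import hashlib


def assign_bucket(user_id, experiment="contributor-tagline", treatment_share=0.5):
    """Deterministically place a user in "treatment" or "control".

    `experiment` is a hypothetical label; mixing it into the hash keeps
    bucket assignments independent across different experiments.
    """
    key = "{}:{}".format(experiment, user_id).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 8 hex digits (32 bits) onto the interval [0, 1).
    position = int(digest[:8], 16) / 0x100000000
    return "treatment" if position < treatment_share else "control"
```

Lowering treatment_share (say to 0.05) gives the more realistic setup mentioned above, where the vast majority of editors keep the default experience.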

In this example we'd be looking for significance, so we'd be looking at using some kind of statistical hypothesis test. Those vary depending on what probability distributions the underlying population follows. So we need to work out what probability distribution is most appropriate, and then apply the test most appropriate to that distribution. And that's not something that can be automated through software. As a result, we get the data and then work out how to test for significance.
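As one illustration of what "the test most appropriate to that distribution" could look like, here is a minimal Python sketch of a two-proportion z-test. It assumes each page view is an independent yes/no outcome (edited or not) and that the samples are large enough for a normal approximation; if those assumptions don't hold for the real data, a different test is needed, which is exactly the judgement call described above. The function and parameter names are hypothetical.

```python
import math


def two_proportion_z_test(edits_a, views_a, edits_b, views_b):
    """Return (z, two-sided p-value) for the difference in edit rates.

    Assumes independent binary outcomes and large samples
    (normal approximation to the binomial).
    """
    rate_a = edits_a / views_a
    rate_b = edits_b / views_b
    # Pooled rate under the null hypothesis that both groups edit equally often.
    pooled = (edits_a + edits_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        # No variation at all (e.g. nobody edited in either group).
        return 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A call such as two_proportion_z_test(edits_with_tagline, views_with_tagline, edits_without, views_without) would then be compared against whatever significance threshold was fixed before the experiment started.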

The alternative approach would be something observational; you make the change and then compare the behaviour of people while the change is live to their behaviour before and after. This cuts out most of the developer cost but doesn't do anything for the research support or the ad-hoc code and tools that need to come with it.

On 26 August 2014 10:52, Edward Saperia <ed@wikimanialondon.org> wrote:

You mean, you don't have them yourselves, or you can't expose them?

On 26 August 2014 15:46, Oliver Keyes <okeyes@wikimedia.org> wrote:

Except we don't have those tools. There are a lot of domains in the ecosystem where this kind of experimentation and targeting happens on a per-wiki or per-project basis, but we have a big gap around functionality and expertise to let us scientifically test the efficacy of their various implementations.

On 26 August 2014 10:34, Edward Saperia <ed@wikimanialondon.org> wrote:

There's no point in polling existing community members about functionality they will not see.

While I am a great supporter of your team's work, I'd just like to comment on the above;

Wiser community members are aware that they are part of a powerful ecosystem, and that taming this ecosystem is a far more leveraged pursuit than doing the work yourself. Creating additional endpoints for onboarding processes that you're exposing to new users should be something that all projects are excited to take part in, so hopefully you'd want to poll the community for the valuable "Yes, and..." responses you'll get.

If you find you don't get responses like this, perhaps you might want to consider re-framing your new functionality as open infrastructure that the rest of the community is invited to build on, for example maybe wikiprojects themselves could specify the suggestions that are shown to new editors who edit in their subject areas?

Given appropriate tools to track effectiveness, this could create a huge, open environment for experimentation that could find interesting solutions faster than any engineering department ever could on their own.

_______________________________________________
EE mailing list
EE@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/ee

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

--
Jonathan T. Morgan
Learning Strategist
Wikimedia Foundation
User:Jmorgan (WMF)

jmorgan@wikimedia.org
