On Fri, Apr 27, 2012 at 1:48 AM, WereSpielChequers <werespielchequers@gmail.com> wrote:

I'm glad to see that you didn't hand out undeserved barnstars, but there should be ways to identify and "reward" other groups. The simplest would be to look at other deciles, classify a sample of editors into ones that deserved a barnstar and ones that didn't, and then do an A/B test amongst the "deserving".
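The A/B step suggested above could be sketched roughly as follows. This is only an illustration: it assumes the "deserving" editors have already been identified by hand, and the function name `assign_ab` and the usernames are hypothetical, not taken from the study.

```python
import random


def assign_ab(deserving_editors, seed=0):
    """Randomly split pre-screened editors into treatment and control.

    `deserving_editors` is assumed to be a list of usernames that a human
    reviewer has already judged to merit a barnstar (hypothetical input).
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(deserving_editors)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    treatment = shuffled[:half]    # would receive a barnstar
    control = shuffled[half:]      # would receive nothing
    return treatment, control


if __name__ == "__main__":
    editors = [f"Editor{i}" for i in range(10)]  # placeholder usernames
    treatment, control = assign_ab(editors)
    print("treatment:", treatment)
    print("control:  ", control)
```

Because assignment is random within the deserving group, any later difference in editing activity between the two groups can be attributed to the barnstar rather than to how editors were selected.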

But a word of warning: barnstars go from one editor to another, and often result in the recipient saying thanks on the awarder's talkpage. My suspicion is that the perceived value of the barnstar will degrade if the awarder's talkpage is dominated by people saying thanks for their barnstar.

On a related note, I'd be intrigued as to whether in this test you gave a personalised rationale for the barnstar. My suspicion is that unpersonalised barnstars coming from someone whose talkpage is littered with "thanks for the barnstar" threads will have less effect than personalised barnstars, and I doubt that you used many accounts to award the barnstars. Relatively simple tests to organise would be to identify groups of editors who have not yet received a barnstar but who have reverted a certain number of vandalisms or fixed a certain number of typos (we have specific barnstars for both types of activity).

WSC

On 26 April 2012 21:02, Chitu Okoli <Chitu.Okoli@concordia.ca> wrote:

-------- Original message --------

Subject: Re: [Wiki-research-l] Experimental study of informal rewards in peer production

Date: Thu, 26 Apr 2012 15:50:44 -0400

From: Michael Restivo <mike.restivo@gmail.com>

To: Chitu Okoli <Chitu.Okoli@concordia.ca>, Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>

Hi Chitu,

Yes, your conjecture is spot-on. Here is a more detailed response that I sent to Joseph. I tried sending this to the wiki-research-l but the email keeps bouncing back to me. If you're interested and willing to share it with the list, that would be acceptable to me.

We thought about this question quite extensively, and there are a few reasons why we sampled the top 1% (which we didn't get around to discussing in this brief paper). First, because of the high degree of contribution inequality in Wikipedia's editing community, we were primarily interested in how status rewards affect the all-important core of highly active editors. There is also a lot of turnover in the long tail of the distribution, and even among the most active editors, there is considerable heterogeneity. Focusing on the most active users gave us sufficient statistical power. (Post-hoc power analysis suggests that our sample size would need to be several thousand users in the 80th-90th percentiles, and several hundred in the 90th-99th percentiles, to discern an effect of the same strength.) Also, we considered the question of construct validity: which users are deserving (so to speak) of receiving an editing award or social recognition of their work?

You are right that it should be fairly easy to extend this analysis beyond just the top 1%, but just how wide a net to cast remains a question. The issue of power calculation and sample size becomes increasingly difficult to manage for lower deciles because of the power-law distribution. And I don't think it would be very meaningful to assess the effect of barnstars on the bottom half of the distribution, for example, for the substantive reasons I mentioned above. Still, I'd be curious to hear what you think, and whether there might be some variations on this experiment that could overcome these limitations.
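To make the power discussion above concrete, here is a minimal sketch of how required sample size grows as the detectable effect shrinks. It assumes a two-sided, two-group comparison of means under a normal approximation; the standardized effect sizes passed in are purely illustrative and are not figures from the paper, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (difference in means divided by
    the common standard deviation). All inputs are assumptions for illustration.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for a two-sided test
    z_power = z.inv_cdf(power)            # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative effect sizes only: smaller effects need far larger samples.
    for d in (0.5, 0.2, 0.1):
        print(f"d = {d}: about {n_per_group(d)} editors per group")
```

The required sample scales with 1/d², so if the same absolute change in edits corresponds to a smaller standardized effect among less active and more variable editors, the number of editors needed rises quickly.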

In terms of data dredging, that is always a concern and I completely understand where you are coming from. In fact, as both an author and a consumer of scientific knowledge, I'm rarely completely satisfied. For example, a related concern that I have is the filing cabinet effect: when research produces null (or opposite) results and the authors therefore decide not to try to publish it.

In this case, I actually started this project with the hunch that barnstars would lead to a slight decline in editing behavior; my rationale was that rewards would act as social markers that editors' past work was sufficient to earn social recognition and hence receiving such a reward would signal that the editor had "done enough" for the time being. In addition to there being substantial support for this idea in the economics literature, this intuition stemmed from hearing about an (unpublished) observational study of barnstars by Gueorgi Kossinets (formerly at Cornell, now at Google) that suggested editors receive barnstars at the peak of their editing activity. Of course, we chose an experimental design precisely to help us to tease out the causal direction as well as what effect barnstars have for recipients relative to their unrewarded counterparts. We felt like no matter what we found - either a positive, negative, or even no effect - it would have been interesting enough to publish, so hopefully that alleviates some of your concerns.

Please let me know if you have any other questions, and I'd love to hear your thoughts about potential follow-ups to this research.

Regards,

Michael

On Thu, Apr 26, 2012 at 3:30 PM, Chitu Okoli <Chitu.Okoli@concordia.ca> wrote:

One obvious issue is that it would be unethical to award barnstars to contributors who did not deserve them. However, the 1% most productive contributors, by definition, deserved the barnstars that the experimenter awarded them. Awarding barnstars to undeserving contributors for experimental purposes would probably not have passed the ethics review board so easily. As the article notes:

----------
This study's research protocol was approved by the Committees on Research Involving Human Subjects (IRB) at the State University of New York at Stony Brook (CORIHS #2011-1394). Because the experiment presented only minimal risks to subjects, the IRB committee determined that obtaining prior informed consent from participants was not required.
----------

This is my conjecture; I'd like to hear the author's comments.

~ Chitu

-------- Original message --------
Subject: [Wiki-research-l] Experimental study of informal rewards in peer production
From: Joseph Reagle <joseph.2011@reagle.org>
To: michael.restivo@stonybrook.edu
Cc: Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>
Date: 26 April 2012 11:42:01

In this [study](http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0034358):

> Abstract: We test the effects of informal rewards in online peer production. Using a randomized, experimental design, we assigned editing awards or “barnstars” to a subset of the 1% most productive Wikipedia contributors. Comparison with the control group shows that receiving a barnstar increases productivity by 60% and makes contributors six times more likely to receive additional barnstars from other community members, revealing that informal rewards significantly impact individual effort.

I wonder why it is limited to the top 1%? I'd love to see the analysis repeated (should be trivial) on each decile. Besides satisfying my curiosity, some rationale and/or discussion of other deciles would also address any methodological concern about data dredging.

--

Michael Restivo
Department of Sociology
Social and Behavioral Sciences S-433
Stony Brook University
Stony Brook, NY 11794
mike.restivo@gmail.com

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

