-------- Original Message -------- Subject: Re: [Wiki-research-l] Experimental study of informal rewards in peer production Date: Thu, 26 Apr 2012 15:50:44 -0400 From: Michael Restivo mike.restivo@gmail.com To: Chitu Okoli Chitu.Okoli@concordia.ca, Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org
Hi Chitu,
Yes, your conjecture is spot-on. Here is a more detailed response that I sent to Joseph. I tried sending this to the wiki-research-l but the email keeps bouncing back to me. If you're interested and willing to share it with the list, that would be acceptable to me.
We thought about this question quite extensively, and there are a few reasons why we sampled the top 1% (which we didn't get around to discussing in this brief paper). First, because of the high degree of contribution inequality in Wikipedia's editing community, we were primarily interested in how status rewards affect the all-important core of highly active editors. There is also a lot of turnover in the long tail of the distribution, and even among the most active editors there is considerable heterogeneity. Focusing on the most active users ensured we had sufficient statistical power. (Post-hoc power analysis suggests that our sample size would need to be several thousand users in the 80th-90th percentiles, and several hundred in the 90th-99th percentiles, to discern an effect of the same strength.) Also, we considered the question of construct validity: which users are deserving (so to speak) of receiving an editing award or social recognition of their work?
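As a rough illustration of the kind of post-hoc power calculation being described, here is a minimal sketch using statsmodels; the effect sizes and percentile bands are assumptions chosen for illustration, not figures from the study.

```python
# Illustrative only: how required sample size grows as the assumed effect
# shrinks for less-active percentile bands. Effect sizes are hypothetical.
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()

assumed_effects = {          # hypothetical standardized effects (Cohen's d)
    "99th-100th percentile": 0.6,
    "90th-99th percentile": 0.2,
    "80th-90th percentile": 0.1,
}

for band, d in assumed_effects.items():
    n_per_group = power_calc.solve_power(effect_size=d, alpha=0.05,
                                         power=0.8, alternative="two-sided")
    print(f"{band}: ~{n_per_group:.0f} subjects per group")
```

With these made-up effect sizes, the 90th-99th band needs a few hundred subjects per group and the 80th-90th band well over a thousand, which is the order of magnitude Restivo describes.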
You are right that it should be fairly easy to extend this analysis beyond the top 1%, but just how wide a net to cast remains a question. The issue of power calculation and sample size becomes increasingly difficult to manage for lower deciles because of the power-law distribution. And I don't think it would be very meaningful to assess the effect of barnstars on the bottom half of the distribution, for example, for the substantive reasons I mentioned above. Still, I'd be curious to hear what you think, and whether there might be some variations on this experiment that could overcome these limitations.
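For what the bookkeeping of a wider net might look like, here is a hedged sketch that bins editors into activity deciles; the data, DataFrame, and column names are invented for illustration.

```python
# Hypothetical sketch: bin editors into activity deciles before sampling.
# The DataFrame and column names ("user", "edit_count") are made up, and the
# edit counts are synthetic heavy-tailed draws.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
editors = pd.DataFrame({
    "user": [f"editor_{i}" for i in range(10_000)],
    "edit_count": (rng.pareto(a=1.5, size=10_000) * 10).round().astype(int),
})

# labels=False gives integer band codes; duplicates="drop" tolerates the many
# tied low edit counts that a heavy-tailed distribution produces.
editors["decile"] = pd.qcut(editors["edit_count"], q=10, labels=False,
                            duplicates="drop") + 1

# How many editors each band offers -- the constraint that interacts with
# the power calculation sketched above.
print(editors["decile"].value_counts().sort_index())
```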
In terms of data dredging, that is always a concern and I completely understand where you are coming from. In fact, as both an author and a consumer of scientific knowledge, I'm rarely ever completely satisfied. For example, a related concern that I have is the file-drawer effect - when research produces null (or opposite) results and the authors therefore decide not to try to publish it.
In this case, I actually started this project with the hunch that barnstars would lead to a slight decline in editing behavior; my rationale was that rewards would act as social markers that editors' past work was sufficient to earn social recognition and hence receiving such a reward would signal that the editor had "done enough" for the time being. In addition to there being substantial support for this idea in the economics literature, this intuition stemmed from hearing about an (unpublished) observational study of barnstars by Gueorgi Kossinets (formerly at Cornell, now at Google) that suggested editors receive barnstars at the peak of their editing activity. Of course, we chose an experimental design precisely to help us to tease out the causal direction as well as what effect barnstars have for recipients relative to their unrewarded counterparts. We felt like no matter what we found - either a positive, negative, or even no effect - it would have been interesting enough to publish, so hopefully that alleviates some of your concerns.
Please let me know if you have any other questions, and I'd love to hear your thoughts about potential follow-ups to this research.
Regards, Michael
On Thu, Apr 26, 2012 at 3:30 PM, Chitu Okoli <Chitu.Okoli@concordia.ca> wrote:
One obvious issue is that it would be unethical to award barnstars to contributors who did not deserve them. However, the 1% most productive contributors, by definition, deserved the barnstars that the experimenter awarded them. Awarding barnstars to undeserving contributors for experimental purposes probably would not have gotten past the ethical review board so easily. As the article notes:
---------- This study's research protocol was approved by the Committees on Research Involving Human Subjects (IRB) at the State University of New York at Stony Brook (CORIHS #2011-1394). Because the experiment presented only minimal risks to subjects, the IRB committee determined that obtaining prior informed consent from participants was not required. ----------
This is my conjecture; I'd like to hear the author's comments.
~ Chitu
-------- Original Message -------- Subject: [Wiki-research-l] Experimental study of informal rewards in peer production From: Joseph Reagle <joseph.2011@reagle.org> To: michael.restivo@stonybrook.edu Cc: Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org> Date: 26 April 2012 11:42:01
In this [study](http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0034358):
> Abstract: We test the effects of informal rewards in online peer production. Using a randomized, experimental design, we assigned editing awards or “barnstars” to a subset of the 1% most productive Wikipedia contributors. Comparison with the control group shows that receiving a barnstar increases productivity by 60% and makes contributors six times more likely to receive additional barnstars from other community members, revealing that informal rewards significantly impact individual effort.
I wonder why it was limited to the top 1%. I'd love to see the analysis repeated (it should be trivial) on each decile. Besides satisfying my curiosity, some rationale and/or discussion of other deciles would also address any methodological concern about data dredging.
--
Michael Restivo Department of Sociology Social and Behavioral Sciences S-433 Stony Brook University Stony Brook, NY 11794 mike.restivo@gmail.com
I'm glad to see that you didn't hand out undeserved barnstars, but there should be ways to identify and "reward" other groups. The simplest would be to look at other deciles, classify a sample of editors into ones that deserved a barnstar and ones that didn't, and then do an A/B test amongst the "deserving".
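As a minimal sketch of the A/B design being suggested - restrict to editors a human has already classified as deserving, then randomize - assuming a placeholder list of such editors:

```python
# Sketch of the suggested design: take a hand-classified list of "deserving"
# editors from a lower decile and randomize them into award vs. control.
# The editor names and group size below are placeholders.
import random

deserving = [f"editor_{i}" for i in range(200)]   # hypothetical classification

rng = random.Random(42)
rng.shuffle(deserving)
half = len(deserving) // 2
treatment, control = deserving[:half], deserving[half:]

# `treatment` would receive a barnstar; `control` would not, and both groups'
# subsequent editing activity would then be compared.
print(len(treatment), "treated,", len(control), "control")
```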
But a word of warning: barnstars go from one editor to another, and often result in the recipient saying thanks on the awarder's talkpage. My suspicion is that the perceived value of the barnstar will degrade if the awarder's talkpage is dominated by people saying thanks for their barnstar.
On a related note, I'd be intrigued to know whether in this test you gave a personalised rationale for the barnstar. My suspicion is that unpersonalised barnstars coming from someone whose talkpage is littered with "thanks for the barnstar" threads will have less effect than personalised ones, and I doubt you used many accounts to award the barnstars. Relatively simple tests to organise would be to identify groups of editors who have not yet received a barnstar but who have reverted a certain number of vandalisms or fixed a certain number of typos (we have specific barnstars for both types of activity).
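Identifying such candidate groups could look something like the following sketch; the table, column names, and thresholds are all invented for illustration.

```python
# Hypothetical filter for the suggestion above: editors with no prior
# barnstar but substantial anti-vandalism or typo-fixing activity.
# Column names and thresholds are made up.
import pandas as pd

activity = pd.DataFrame({
    "user": ["a", "b", "c", "d"],
    "vandalism_reverts": [120, 5, 60, 300],
    "typo_fixes": [10, 400, 2, 0],
    "has_barnstar": [False, False, True, False],
})

candidates = activity[
    ~activity["has_barnstar"]
    & ((activity["vandalism_reverts"] >= 50) | (activity["typo_fixes"] >= 100))
]
print(candidates["user"].tolist())   # would be eligible for the targeted barnstars
```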
WSC
Following the previous comment, I'm very curious to see one of the awarded barnstars. Was it ensured that users were not aware of the experiment? How personalised were the awards? Maybe the awarded editors think their activity is being observed by some organisation, and that pretty much explains why they become more active after receiving the award!
bests, .Taha
There is some more detail about this study in the recent research section of The Signpost: https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2012-04-30/Recent...

-- Steven Walling https://wikimediafoundation.org/

Thank you, Steven. If the mentioned IP who distributed the barnstars is related to this research, one should be really careful in interpreting the results, since, as far as we know, it is not the normal workflow in WP to award the editors this way.

bests, .taha
[ Apologies that I'm late to the party. I attempted to send this some time ago but from the "wrong" email address and didn't notice when it bounced. ]
<quote who="Chitu Okoli" date="Thu, Apr 26, 2012 at 04:02:39PM -0400">
In this case, I actually started this project with the hunch that barnstars would lead to a slight decline in editing behavior; my rationale was that rewards would act as social markers that editors' past work was sufficient to earn social recognition and hence receiving such a reward would signal that the editor had "done enough" for the time being. In addition to there being substantial support for this idea in the economics literature, this intuition stemmed from hearing about an (unpublished) observational study of barnstars by Gueorgi Kossinets (formerly at Cornell, now at Google) that suggested editors receive barnstars at the peak of their editing activity.
Aaron Shaw, Yochai Benkler, and I have a working paper on barnstars as well, which considers them observationally and finds support for this finding. This is complicated by the fact that barnstars are endogenous to editing behavior (outside of Restivo's study); people don't get barnstars randomly. :) It makes sense that folks will be congratulated for doing work at the point at which they are, on average, doing the most, and we should be surprised if this were not the case.
In part because of this fact, my own work hasn't focused on the effect of the awards but on how we can use them to get at sub-population differences among recipients.
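A toy simulation, using only synthetic data, shows why that endogeneity matters: if awards tend to arrive at an editor's activity peak, a naive before/after comparison will show a post-award "decline" even when the award has no effect at all.

```python
# Synthetic illustration of the endogeneity / regression-to-the-mean problem:
# if "awards" are handed out at each editor's peak week, activity in the next
# week looks like a decline even though nothing causal happened. No real data.
import numpy as np

rng = np.random.default_rng(7)
n_editors, n_weeks = 500, 20
activity = rng.poisson(lam=20, size=(n_editors, n_weeks)).astype(float)

peak_week = activity.argmax(axis=1)        # pretend the award lands at each editor's peak
valid = peak_week < n_weeks - 1            # need a "next week" to compare against
award_week = activity[valid, peak_week[valid]]
next_week = activity[valid, peak_week[valid] + 1]

print("mean edits in award week:", award_week.mean())
print("mean edits the week after:", next_week.mean())
# The drop is pure regression to the mean -- exactly the confound that a
# randomized assignment of barnstars removes.
```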
If folks are curious, Aaron and I will be presenting some of this work at Wikimania as part of this session:
https://wikimania2012.wikimedia.org/wiki/Submissions/Can_Social_Awards_Creat...
If he can make it, Michael Restivo might present the experiment as part of that session as well.
Regards, Mako
<quote who="Chitu Okoli" date="Thu, Apr 26, 2012 at 04:02:39PM -0400"> In this case, I actually started this project with the hunch that barnstars would lead to a slight decline in editing behavior;
That's the conclusion one would project from, e.g., Alfie Kohn: http://naggum.no/motivation.html (also included in Emacs distros as ./etc/MOTIVATION).
It may be important that the barnstars are ephemeral/symbolic - if they could be traded for cash they might have less of an effect.
Still, Kohn extends the 'critique' (if it is one) to symbolic praise (viz. symbolic violence): http://www.alfiekohn.org/parenting/gj.htm
(Note, I realize most WP top editors are not children, and the logic of praise may be very different in different age groups.)
This may be a cultural thing, but some people value symbolic prizes over cash ones.
Here in the UK the most prestigious quiz show has only one prize per series - a cut glass bowl. http://en.wikipedia.org/wiki/Mastermind_%28TV_series%29
Other shows may make you rich, but they don't have the same cachet.
That's why I think it is important to look at who gives the barnstar or whatever, how relevant the award is to the best work that the recipient does, and how it is presented. My suspicion about the recent barnstar test is that when recipients went to the awarder's talkpage to see who had given them that barnstar, and frequently to say thanks, the number of others doing the same may have devalued the barnstar. If I'm right, the earlier recipients will perform differently from the later ones when adjusted for chronology.
An interesting alternative would be to get a bunch of experienced editors to each award a small number of barnstars.
WSC
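The chronology check suggested above could be sketched as a simple regression of each recipient's post-award change in activity on the order in which the awards were given; the data and variable names here are entirely synthetic, and a fade is built into the simulated outcome so the example has something to find.

```python
# Sketch of the chronology check: does the barnstar's effect shrink for
# editors who received theirs later, once the awarder's talkpage filled up
# with thanks? Synthetic data and an assumed linear trend, for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
award_order = np.arange(1, 101)             # 1st, 2nd, ..., 100th award given
# Hypothetical outcome: change in weekly edits after the award.
post_minus_pre = 30 - 0.1 * award_order + rng.normal(0, 5, size=100)

X = sm.add_constant(award_order)
fit = sm.OLS(post_minus_pre, X).fit()
print(fit.params)   # a negative slope on award_order would support the suspicion
```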