*phoebe ayers* phoebe.wiki at gmail.com writes:
++++++ I'm not sure there's any way to get a non-self-selected survey about anything on the projects due to anonymity concerns. ++++++
I'm a 17-year veteran of implementing professional quantitative survey research. Self-selection bias is a complicated subject, but there are some fairly accessible and intuitive techniques one may implement to create a thoughtful survey of a target population that minimizes self-selection concerns. This allows the stakeholders to focus on the challenge of deriving meaning from the response data rather than agonizing over the sampling methodology.
I am willing to give, pro bono, 45 minutes of telephone consulting time to any Wikimedia Foundation staff member who is attached to this particular survey project, on the condition that they will be open and attentive to the possibility that a properly designed and fairly executed survey may not return results that support their preconceived desires to railroad through a license migration (which, unfortunately, is my key takeaway from observing this discussion).
*Brian Mingus* Brian.Mingus at colorado.edu writes:
This entire field has been formalized, but in my experience the key things to worry about are experimenter bias and subject bias.
Experimenter bias in a survey context means that the survey writer (Erik) has expectations about the likely community answers. This has been clearly demonstrated: he already has a feeling about what the German survey results will be, even though that survey hasn't been written yet. Writing an unbiased survey requires very careful wording and is a tough job. If the entire point of the survey is to find out what the community thinks, then the survey should be unbiased.
A variety of types of subject bias are overcome by taking a random sample. The objection that the survey takers are self-selected can be addressed by also recording demographic information and normalizing the number of responses across demographic groups, or by applying some other kind of filter.
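A minimal sketch of that normalization step, often called post-stratification weighting; the demographic cells, population shares, and answers below are invented for illustration:

# Hypothetical sketch of post-stratification weighting: each response is
# weighted by (population share of its demographic cell) / (sample share),
# so over-represented groups count for less and under-represented groups
# for more. The cells and shares are invented, not from any real survey.
from collections import Counter

population_share = {"reader": 0.80, "casual_editor": 0.15, "heavy_editor": 0.05}

def poststratify(responses):
    """responses: list of (demographic_cell, answer) pairs."""
    n = len(responses)
    sample_counts = Counter(cell for cell, _ in responses)
    weights = {cell: population_share[cell] / (count / n)
               for cell, count in sample_counts.items()}
    tally = Counter()
    for cell, answer in responses:
        tally[answer] += weights[cell]
    return tally

# Heavy editors dominate the raw responses but are down-weighted:
responses = [("heavy_editor", "yes")] * 60 + [("reader", "no")] * 40
print(poststratify(responses))  # "no" far outweighs "yes" after weighting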
You essentially need to employ psychometric techniques in order to verify the construct validity of the survey (that is, that you can really draw the intended inferences from those questions).
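Construct validity proper is usually checked with factor analysis or comparison against an established instrument; a simpler, related psychometric screen is internal consistency. A minimal sketch with invented Likert scores (Cronbach's alpha measures reliability, a prerequisite for validity rather than validity itself):

# Cronbach's alpha: do the questions that are supposed to measure one
# construct actually hang together? Scores below are invented.
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, n_questions) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of per-question variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of respondents' totals
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Three Likert-style questions meant to tap the same attitude:
scores = [[5, 4, 5], [2, 2, 1], [4, 5, 4], [1, 2, 2], [3, 3, 4]]
print(round(cronbach_alpha(scores), 2))  # values above ~0.7 are conventionally acceptable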
Erik's survey, in my opinion, is likely to have low construct validity and should instead have been created by a blind, relatively unbiased third party. Creating a survey in which the subjects are non-self-selected is a practical impossibility. I can think of some software methods that might help, but the better solution is to gather rich demographics and then filter.
As a non-statistician (and, from this list, you'd think there are lots of professional statisticians participating...), can one of the experts explain the practical implications of the bias of this survey? It seems fairly informal, intended perhaps to be food for thought but not a definitive answer. Is this survey sufficiently accurate (i.e., accurate in a very broad way) to serve its purpose? How much will problems with methodology (which I'm sure Erik knew would be pointed out immediately) distort the results?
Nathan
The official results of the survey haven't even been announced yet, and already it is being accused of bias. Have any of you actually looked at the survey? It does include demographic questions and it's a ranked preference poll. If someone were trying to skew the results in a particular way, this survey would be a pretty poor way to attempt it.
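For what a tally of such a poll can look like, here is a minimal sketch using a Borda count, one common way to aggregate ranked preferences; the ballots are invented, and the survey's actual tallying method may differ:

# Borda count over ranked ballots: with m options, a ballot gives m-1
# points to its first choice, m-2 to the second, and so on. This is one
# of several reasonable aggregation rules, not necessarily the one used.
from collections import Counter

def borda(ballots):
    """ballots: list of rankings, each a list of options, best first."""
    scores = Counter()
    for ballot in ballots:
        m = len(ballot)
        for position, option in enumerate(ballot):
            scores[option] += m - 1 - position
    return scores.most_common()

ballots = [
    ["support", "neutral", "oppose"],
    ["support", "oppose", "neutral"],
    ["oppose", "neutral", "support"],
]
print(borda(ballots))  # [('support', 4), ('oppose', 3), ('neutral', 2)]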
Ryan Kaldari
For what it's worth, what Nathan says basically sums up my concerns as well. I think for a (relatively informal, community-opinion) survey it's less important to have an absolutely rigorous methodology (not what I was asking for) than it is to ask the question: is this good enough for our purposes? (and indeed, what *are* our purposes, and how does that influence what we ask?)
Saying that community opinion should be taken into account on this question is wonderful, and crucial -- but as we all know it's damn hard to determine community opinion with any degree of reliability. Devoting some thought to this non-trivial matter has useful implications for determining *all sorts* of controversial, broad-scale questions, however, and getting it right means that we are one step closer to better community governance. Or if we can't get it "right", let's acknowledge what the biases are, and be very clear on the kinds of input that did go into this conversation. For instance, many of the people who have participated in the GFDL rewrite and the discussion so far are some of the preeminent free-content, free-culture, open-knowledge experts in the world: that should be acknowledged. There are many more potential constituencies that haven't had a say, however.
For instance, a while back I polled a handful of librarian colleagues who are occasional Wikipedia contributors about their thoughts on attribution, just for my own edification. Obviously, the plural of anecdote is not data, but I still found their anecdotes interesting. These are all people who know something about copyright and quite a bit about 'attribution' in the academic world (our job, as librarians, is often to advise people on how to provide proper credit to sources). They were all firmly against the list-all-authors method of attribution. One said:
"I expect no personal attribution whatsoever for work on WP. The point of WP is that it is a communal/communitarian encyclopedia. To give credit to individual author defeats that aim. Further, pages evolve, even if some given selection of articles wind up printed. To identify authors as of 2009 ignores the work that will almost certainly come later, and it implicitly devalues that later work by giving primacy to the people who got the ball rolling on an article."
This is a strong and interesting opinion that as far as I know hasn't even been expressed in quite that way on this mailing list. Part of the reason I question the survey is that its design explicitly excludes the opinions of people like my friend, who edits under an IP afaik.
-- Phoebe
On Wed, Mar 4, 2009 at 2:18 PM, phoebe ayers phoebe.wiki@gmail.com wrote:
Part of the reason I question the survey is that its design explicitly excludes the opinions of people like my friend, who edits under an IP afaik.
If they didn't include *all* visitors to the site, then it really is a biased sample. Collect from everyone, but also collect demographics. It's the only way to do this right.
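One software method along those lines is to invite a random fraction of *all* visitors at the point of contact, rather than letting whoever follows a link self-select. A minimal sketch; the invitation rate and visitor IDs are hypothetical, and non-response still has to be handled by the demographic weighting discussed above:

# Probability sampling at the point of contact: each visitor is invited
# at a fixed rate, decided deterministically from the visitor ID so the
# same visitor always gets the same answer. Rate and IDs are invented.
import hashlib

def invited(visitor_id, rate=0.01):
    """Return True for roughly `rate` of all visitors."""
    digest = hashlib.sha256(str(visitor_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < rate

sample = [v for v in range(10_000) if invited(v)]
print(len(sample))  # roughly 100 of 10,000 visitors get the survey banner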
On Wed, Mar 4, 2009 at 12:53 PM, Nathan nawrich@gmail.com wrote:
Is this survey sufficiently accurate (i.e., accurate in a very broad way) to serve its purpose? How much will problems with methodology (which I'm sure Erik knew would be pointed out immediately) distort the results?
Being informal about a survey is a slippery slope. The risk is that your conclusions simply do not follow from the premises, and thus that you haven't actually gauged community opinion.
Since:
1) creating unbiased survey software is a one-time cost,
2) the importance of the decisions being made based on the survey is very high, and
3) nobody wishes to distort community opinion,
it must be worth it to do it correctly.
Isn't Mike Godwin an ex-statistician? Pair him up with a developer and they'll be done in a day. Probably doesn't fit the job description though :)
Gregory Kohs wrote:
I am willing to give, pro bono, 45 minutes of telephone consulting time to any Wikimedia Foundation staff member who is attached to this particular survey project, on the condition that they will be open and attentive to the possibility that a properly-designed and fairly-executed survey may not return results that foster their preconceived desires
Except, of course, that such a survey would arguably not have "preconceived desires". So much for empiricism!