I Open licensing. Anyone who wants to broadcast research surveys to our
editing community needs to agree that the anonymised results of those
surveys will be available under cc-by-sa, and not just a statistical
digest but the actual dataset so that variables can be cross tabbed. But
I can live with the researcher(s) also having a copy of the data under a
different copyright if they are narrowcasting to a small group of editors
rather than broadcasting to a large group.
Now it seems to me that the disagreement between you and Aaron during the
last RCom meeting was a misunderstanding. I would agree (and, hopefully,
Aaron would agree too) that the RAW results of the survey should be
cc-by-sa licenced. On the other hand, the results of the research itself
(by this I mean the analysis and conclusions the researchers make from the
raw results, and/or eventually manuscripts) should be open access, but I
would not require the cc-by-sa for the manuscript.
II Timeliness. The cc-by-sa anonymised dataset needs to be published
pretty much as soon as it could be, and not kept back until after the
researcher has published their analysis of it.
Maybe some fixed period would be reasonable? Let us say one month after
the end of the survey? That should be enough time to analyze the data
without fearing competition from other researchers.
III Transparency. The nightmare scenario to me would be if a top thousand
website or aspirant:
1. Sponsors some Academics to do research in an area where they are
having difficulty or want to improve their own online community.
2. Sponsors Wikimedia (most of our money comes from individuals, but
sometimes a company gives us a few thousand dollars)
3. Their sponsored researcher has private discussions with some or all of
us, and gets dispensation not to release part or all of the data they
collect in a way that would enable their sponsor's competitors to get the
same benefit of it.
4. Either they attribute part of their subsequent turnaround to "insights
achieved via research sponsored on Wikimedia", or someone independently
links the three previous points and accuses the WMF of selling research
access to its editorship, and selling it cheaply.
So far the only argument I've seen for confidentiality is that
researchers don't want the data subjects to have a preview of the
questions as that could skew the results. I'd accept that as reasonable,
if a bit tenuous - the chance of there being a significant overlap
between this list and any conceivable research sample is low. But it
could be resolved by holding the discussion on an Email thread that
doesn't get posted until after the surveys are posted.
I think pooling to share the questions (as we discussed at the meeting)
would be possible without actually disclosing the questions to the public;
we just need to find a proper way to do it, perhaps via OTRS. I think
asking about sponsorship is both possible and reasonable.
Cheers
Yaroslav