Hello everyone.
It seems to me that this discussion is about at least two different types of data: (1) data that is collected in private conversations with users; and (2) data that is observed in the public Wikipedia. In preparing a policy, we should be clear about the differences in expectations around different types of data we might want to quote. We can consider quotes from:
1) Interviews with editors.
2) IRC or other ephemeral media.
3) Wikipedia pages that are no longer publicly visible.
4) Wikipedia pages that are currently visible in the active history.
To me, these different types of data have very different expectations from the participants, and should be treated very differently. I would propose something like:
1) Cannot be quoted without explicit permission. Can only be summarized with care. Sources should be preserved to share with other researchers, but only under carefully managed circumstances.
2) Should usually not be quoted. Summaries should be careful not to be re-identifiable. Conversations may be preserved for sharing with other researchers in the future.
3) Can be quoted when necessary to explicate the research, but quotes should be selected judiciously. If the pages were deleted to protect privacy -- or remove libel! -- in a BLP, for instance, quotes should be selected very carefully. Links to, or copies of, the original should be kept for sharing with other researchers when appropriate.
4) Can be quoted without restriction. Researchers should be careful not to inflame conflict unnecessarily, but these are public data. Researchers may choose not to list names on the quotes, *but must realize that this provides no real protection*.
In response to Aaron's question about how to move forward, I believe we should write a proposed policy that seems reasonable to us, and make it visible to the community for comments.
Best, John
On Thu, Oct 28, 2010 at 4:21 PM, Aaron Halfaker aaron.halfaker@gmail.com wrote:
I think that finding answers to these questions (among others) will be important for writing a general research policy and advising researchers about how they should behave in Wikipedia/Wikimedia projects. How do we discover those answers? Do we ask members of the Wikipedia community to tell us? In which forum? Which language? Or should we instead assert what we think is reasonable and have the community tell us where we went wrong?
On Thu, Oct 28, 2010 at 3:44 PM, Ziko van Dijk zvandijk@googlemail.com wrote:
Hello,
Yes indeed, a good division of the question, Aaron,
Kind regards Ziko
2010/10/25 Aaron Halfaker aaron.halfaker@gmail.com:
Simply describing what took place in that conversation would limit the value it provides to a reader. I feel strongly that using examples to support analysis is a good way to inform and convince your readers, but I agree that de-anonymizing editors (whether posting in a public space or not) could be troublesome. I'd like to break this question down into a few parts.
Are Wikipedia's publicly accessible pages anything other than public spaces? Does bringing wide attention to editor actions cause them unnecessary harm? Should/do editors assume that wide attention will not be brought to their editing actions?
On Mon, Oct 25, 2010 at 10:12 AM, Milos Rancic millosh@gmail.com wrote:
You can still describe [your] point of conversation without quoting them, while leaving the conversation inside of the extended materials of your work, available on demand. If it matters, of course.
On Mon, Oct 25, 2010 at 16:33, Ziko van Dijk zvandijk@googlemail.com wrote:
Hello,
I may not have made clear what I mean. In the humanities, you often have to quote conversations. But when I do, I automatically reveal who said what, even if it is not important to me what exactly was said by whom. Wikipedians tend to dislike these quotes, because they can be used to name and shame, to bring up old disputes, and maybe to reveal real identities when done at a large scale.
For example, I copied a piece of conversation here, and replaced the user names. Still, it is easy to retrieve the original conversation with the Search.
Kind regards Ziko
Surely this is just self promotion? I have seen othger articles about websites be delted as they are regarded as self-promotion. —UserA
The article tries to be balanced and neutral, and provide facts about the project and a summary of criticism directed at it without prejudice. How well it achieves this is of course open to interpretation – UserB
I'm very surprised that anyone would think Wikipedia should not have an article on itself. Imagine that someone wants to read a summary of this project. How would they do it? They would have to go through thousands of help pages, which is impossible. So they can come to this article and have the project explained in one place. The article has to be unbiassed of course and point out the flaws of the project as well as the strengths. As someone said elsewhere, you would surely expect to find the word "dictionary" in a dictionary so I would expect to find Wikipedia in Wikipedia! -UserC
2010/10/25 Milos Rancic millosh@gmail.com:
On Sun, Oct 24, 2010 at 21:30, Ziko van Dijk zvandijk@googlemail.com wrote:
> There is one problem that I met several times in relation to Wiki
> research. On the one hand, it is important to reference one's
> "primary sources", e.g. link to a sentence in a WP discussion we are
> talking about. On the other hand, we want to preserve the privacy of
> our users.
>
> For example, in Leipzig a lecturer talked about a certain discussion,
> and he tried to keep it anonymous because it was about the
> discussion itself, not about embarrassing the persons. But the
> anonymity was soon destroyed by curious listeners anyway.
>
> Did you already encounter that problem yourself?
The best way to keep anonymity is to present derivative work, not the source itself. This is a regular part of scientific work in the social sciences. You guarantee validity through your scientific integrity, as well as through the sources, which you keep so that other scientists can check them; they would keep the sources confidential, too.
For example, I remember one piece of sociolinguistic research in which scientists taped the speech of an Irish community. They gathered all the private data and connected it with the recordings. Then they renamed the subjects with letters: A, B, etc. In the end, they made a digest record which they are willing to show to other interested scientists only on demand and, if I remember well, with a kind of confidentiality agreement. I think the research is from the 1970s.
So, you should combine all of the tools you have: derivative works, digesting, stating that the sources are confidential, and finding other [trusting] scientists who would check your sources and vouch for them, too.
And, yes, sources should be handled carefully. There are just a couple of thousand core Wikimedians, and it is relatively easy to work out who the person in the source is.
RCom-l mailing list RCom-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/rcom-l
-- Ziko van Dijk Niederlande