Greetings Christina!
Thanks for sharing this and notifying us on the list.
Overall, I am very supportive of additional attempts to do more rigorous
survey research on Wikipedians. Some questions that I think you could try
to address in the proposal:
- *Sampling*: You mention that you plan to stratify your sample based on
past edit history and recruit via talk page messages. However, beyond this
you say nothing about the logistics of subject sampling, recruitment, or
any approaches you will take to address the fact that conducting
representative surveys in online communities is very, very difficult. Can
you elaborate on this aspect of your study? In particular, how will your
approach address shortcomings in data and sample quality that have affected
previous surveys
<http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0065782>
of Wikipedia contributors?
- *Self-report measures of edit history: *Why ask the respondents to
self-report their edit histories (this kind of thing is notoriously hard to
do accurately) when you could ask them to provide their usernames or at
least link their usernames to their survey responses (since you're
recruiting via talk page messages anyway)?
- *Collaboration with related studies: *There are several other ongoing
efforts to survey Wikipedians, including at least one
<http://dl.acm.org/citation.cfm?id=2145264> (the link to the "official"
publication is gated, but other versions are available for free) focused on
social psychological concerns. Also, my impression is that the WMF is involved in
planning another editor survey in the near future. How will your approach
complement/extend/overlap with these other efforts? Will you make any
effort to collaborate with these ongoing studies? How will your study avoid
subject exhaustion -- especially among more active wikipedians who may find
themselves invited to participate in many surveys?
- *Missing measures and missing people:* Previous studies have shown that a
variety of additional factors may figure in shaping the participation
practices of Wikipedians as well as those who might edit Wikipedia but
choose not to do so. For example, in a recent paper
<http://www.tandfonline.com/doi/abs/10.1080/1369118X.2014.957711#.VSqb244adE4>
(again,
gated link, but I am also happy to provide copies to those who would like
access) that I co-wrote with Eszter Hargittai, we find that web use skills
are, in some ways, even more robust predictors of Wikipedia contribution
than gender. There are many other examples of important measures that
predict participation in various ways as well, whether it be individuals'
trust/caution attitudes, newcomer experiences, etc. Which of these measures
will you include? How will you ensure that you have included the most
important measures in this survey study since survey results are otherwise
quite prone to omitted variable bias?
- *Missing people and sampling on the dependent variable: *Maybe most
importantly, insofar as you say that you are interested in understanding
factors that determine who edits, you are selecting on the dependent
variable (wikipedia editing) by limiting your study to individuals who have
accounts on the encyclopedia and edit already. It strikes me as especially
egregious that you are requiring survey respondents to read and reply to
the survey recruitment materials via talk page message. This means that
precisely those individuals who participate least (and who would provide
your study with necessary variation on the outcome of interest) are the
least likely to respond and to be included in the study. As a result, I
fear that your findings will not speak to these questions effectively
unless you find an alternative method of sampling and recruitment.
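To make the point about self-reported edit histories concrete, here is a
minimal Python sketch of how strata could be assigned from logged edit counts
instead of respondents' recollections. It assumes the counts have already been
collected per username (for example through the MediaWiki API's list=users
query with usprop=editcount). The cut-offs, function names, and usernames are
hypothetical, since the proposal does not define its three groups numerically.
Note that this only illustrates stratifying existing accounts; it does nothing
about the selection problem raised in the last point.

```python
import random
from collections import defaultdict

# Hypothetical cut-offs: the proposal names three groups but gives no thresholds.
STRATA = [("super-editor", 1000), ("active", 100), ("inactive", 0)]


def stratum_for(edit_count):
    """Return the first stratum whose minimum edit count is met."""
    for name, minimum in STRATA:
        if edit_count >= minimum:
            return name
    raise ValueError("edit_count must be non-negative")


def stratified_sample(edit_counts, per_stratum, seed=0):
    """Draw up to `per_stratum` usernames at random from each stratum.

    `edit_counts` maps username -> logged edit count.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    groups = defaultdict(list)
    for user in sorted(edit_counts):
        groups[stratum_for(edit_counts[user])].append(user)
    return {
        name: rng.sample(users, min(per_stratum, len(users)))
        for name, users in groups.items()
    }


# Made-up usernames and counts, for illustration only.
logged = {"UserA": 5200, "UserB": 150, "UserC": 120, "UserD": 3, "UserE": 0}
invitees = stratified_sample(logged, per_stratum=1)
```

Because the strata come from logged counts, the same usernames can later be
linked back to survey responses without asking anyone to estimate their own
activity.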
I hope that these comments are helpful for you as you continue to refine
the study design. I really think you're pursuing a critical set of concerns
in this study and I am eager to see it succeed in the most effective way
possible!
yours,
Aaron
On Sun, Apr 12, 2015 at 8:44 AM, Christina Shane-Simpson <
christinam.shane(a)gmail.com> wrote:
> Hello Fellow Wiki Researchers,
>
> I’ve recently posted a project proposal under the Inspire Campaign and
> would love feedback from this community on the research proposal, *Characterization
> of Editors on Wikipedia*:
>
> In order to accurately explore the main goals of the Inspire Campaign,
> we must be able to effectively characterize our community. Any
> interventions that we develop should reflect and match the needs of the
> target population, requiring a thorough understanding of the traits and
> behaviors of our community of editors. As a direct extension of the recent
> gender gap research on Wikipedia and to explore other potential areas of
> inequality, we’d like to conduct another study that compares the traits of
> the super-editor, the active editor (moderate editing), and the inactive
> editor (infrequent edits).
>
> The proposed project would use an online self-report survey that is posted
> on editor talk pages. The research team has experience conducting online
> surveys and will monitor responses on this survey to identify any potential
> misuse of the survey (i.e. vandalism) and/or outliers in the data. This
> entire project would only be implemented after an IRB approval from the
> lead researcher's academic institution.
>
> Full proposal:
> https://meta.wikimedia.org/wiki/Grants:IdeaLab/Characterization_of_Editors_…
> Thank you in advance for your assistance in developing this proposal!
>
> Christina Shane-Simpson
> Psychology Department
> The College of Staten Island and
> The Graduate Center, CUNY
>
>
Hi Everyone,
I wanted to share two projects currently under consideration for IdeaLab
funding and which may be of direct interest to the Wiki research
community. Note, one purpose of the consciousness-raising repository is to
create a collection of stories for use by researchers studying marginalized
identities on Wikipedia. If you are interested or know someone who is
interested, let me know. If you have feedback for these projects, please
submit it to their discussion pages.
Thanks!
*Consciousness-Raising Repository Call for Working Group Participants*
*Purpose*: We're recruiting a group of diverse Wikipedians to help put
together a repository of stories from users experiencing marginalization on
Wikipedia. The purpose of the repository is to serve as a database of
knowledge about the forms marginalization can take for researchers studying
marginalized identities and as an outlet for users experiencing
marginalization.
*Requirements*: Experience working with marginalization and marginalized
groups. Interest in the Wiki-community. Willing to attend an hour-long
biweekly meeting.
*For more information*:
https://meta.wikimedia.org/wiki/Grants:IdeaLab/A_Consciousness_Raising_Repo…
*Wiki Controversy Monitoring Engine Call for Developers*
*Purpose*: The controversy monitoring engine maintains a real-time rating
of the controversiality of Wikipedia articles by listening to the live
stream of edits from Wikipedia. We need someone who is interested in
building the web interface and interactive visualizations around these
controversies to enable administrators to monitor, investigate, and, if
need be, intervene to deescalate controversies. The goal is to create a
site like stats.wikimedia.org.
*Requirements*: Knowledge of web development, web-based visualization,
and/or data analysis using Wikipedia's API or WikiData.
*For More Information*: see
https://meta.wikimedia.org/wiki/Grants:IdeaLab/Controversy_Monitoring_Engine
--
Jason Radford
Doctoral Student, Sociology, University of Chicago
Visiting Researcher, Lazer Lab, Northeastern University
*Connect*: LinkedIn <http://www.linkedin.com/in/jsradford>, Twitter
<http://www.twitter.com/jsradford>, University of Chicago
<http://home.uchicago.edu/%7Ejsradford/>
*Play Games for Science at Volunteer Science
<http://www.volunteerscience.com>*
Hello fellow wiki-researchers,
I've started working on research for a seminar paper (which will probably
evolve into an M.A. dissertation) on Wikipedia in Higher Education as part of
my MA in "Technology in Learning" program at the School of Education,
Tel-Aviv University. The research will focus on an elective course on
Wikipedia for med students, which I've developed and run at TAU.
*I was wondering if any of you have access to this research -- *
Aibar, E., Lerga, M., Lladós, J., Meseguer, A., & Minguillón, J. (2014).
Wikipedia in Higher Education: an Empirical Study RUSC VOL. 11 No 2 Special
Issue| Universitat Oberta de Catalunya and University of New England|
Barcelona.
Other than an abstract, I wasn't able to find it online (nor is it
available through the university library) and I would really appreciate a
copy (word, PDF, whatever you can get your hands on).
Any help getting this article would be much appreciated, as well as any
pointers to other relevant existing research on the topic.
Cheers,
Shani.
Oliver,
Here is one paper on mapping topic coverage in Wikipedia from 2009:
Kittur, A., Chi, E. H., and Suh, B. 2009. What's in Wikipedia?: Mapping
Topics and Conflict using Socially Annotated Category Structure
<http://www-users.cs.umn.edu/~echi/papers/2009-CHI2009/p1509.pdf>. In
Proceedings of the 27th International Conference on Human Factors in
Computing Systems (Boston, MA, USA, April 04 - 09, 2009). CHI '09. ACM, New
York, NY, 1509-1512.
ACM Link <http://doi.acm.org/10.1145/1518701.1518930> (24% acceptance rate)
--Ed (on my tablet)
On Apr 8, 2015 05:01, <wiki-research-l-request(a)lists.wikimedia.org> wrote:
> Today's Topics:
>
> 1. Re: Gender-specific page titles (Flöck)
> 2. Re: Research on Wikidata's content coverage (Flöck)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 8 Apr 2015 11:09:00 +0000
> From: Flöck, Fabian <Fabian.Floeck(a)gesis.org>
> To: Research into Wikimedia content and communities
> <wiki-research-l(a)lists.wikimedia.org>
> Subject: Re: [Wiki-research-l] Gender-specific page titles
> Message-ID: <7ECBA035-F388-4F89-9B65-A7C1C956AA2C(a)gesis.org>
> Content-Type: text/plain; charset="utf-8"
>
> Interesting, thanks Mark!
>
> - Fabian
>
> On 07.04.2015, at 16:38, Mark J.Nelson <mjn(a)anadrome.org<mailto:
> mjn(a)anadrome.org>> wrote:
>
>
> Flöck, Fabian <Fabian.Floeck(a)gesis.org<mailto:Fabian.Floeck@gesis.org>>
> writes:
>
> Does anyone know about a study that looks at how often for example
> articles about a profession use the male instead of the female form as
> the name (female form doesn't exist or is just a redirect)?
>
> It would probably not be so much of an issue for English, but rather
> Spanish, German, Russian etc. Concrete example:
> https://de.wikipedia.org/wiki/Professor exists in German, but
> https://de.wikipedia.org/wiki/Professorin is just a redirect.
>
> One thing to be careful of in such a study (though I would also like to
> see it!) is that the politics and preferences in this area vary widely
> across languages, and sometimes within a language, so a purely
> data-driven study has to be careful about its assumptions and
> generalizations.
>
> Below is a long-ish discussion of Greek that you may skip if not interested
> (it ended up longer-winded than I had expected):
>
> For example in Greek it is very profession-specific whether the trend is
> towards using a slashed form of both genders, or towards convergence on
> a single form that applies to both genders (sometimes with atypical
> morphology). Sometimes it depends on the specific word form and
> historical usage. In fields that historically had both men and women,
> both forms are very well established and tend to persist, e.g. a male
> teacher is a δάσκαλος and a female one is a δασκάλα. But in fields that
> were typically so male-dominated that only the masculine version has
> been in common use, there's disagreement over whether it's more
> progressive to "revive" a feminine form, or to generalize the masculine
> form to cover both genders. For example a female president would
> universally be called by the historically masculine form πρόεδρος,
> but with a feminine article (i.e. πρόεδρος can now be either a masculine
> or feminine noun, depending on context, even though it's morphologically
> irregular as a feminine noun). There is in Byzantine Greek a feminine
> analog, προέδρισσα (referring to a different position), but it isn't
> used today outside humorous contexts (roughly where you might use
> "Presidentess" in English). The same applies for a number of other more
> common professions, but for some it's more disputed which form should be
> used (for President there isn't any usage variance).
>
> In short it's complex, so I hope any data set is careful about what it's
> counting as data, and why. :)
>
> -Mark
>
> --
> Mark J. Nelson
> Anadrome Research
> http://www.anadrome.org
>
> Cheers,
> Fabian
>
> --
> Fabian Flöck
> Research Associate
> Computational Social Science department @GESIS
> Unter Sachsenhausen 6-8, 50667 Cologne, Germany
> Tel: + 49 (0) 221-47694-208
> fabian.floeck(a)gesis.org<mailto:fabian.floeck@gesis.org>
>
> www.gesis.org
> www.facebook.com/gesis.org
Hey all,
Is anyone aware of research on the completeness of Wikidata, in terms
of coverage and systemic bias? This seems like the sort of thing Max
Klein might know ;). Papers, blog posts, anything.
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
http://onlinelibrary.wiley.com/doi/10.1111/jcom.12123/abstract
*"What Creates Interactivity in Online News Discussions? An Exploratory
Analysis of Discussion Factors in User Comments on News Items"*
If you have access, and can send me a PDF offline, I would be very grateful
:)
Cheers,
Jonathan
--
Jonathan T. Morgan
Community Research Lead
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
jmorgan(a)wikimedia.org
Hi all, we produced a prototype of an editor-editor interaction network visualization for individual articles, based on the word/tokens deleted and reintroduced by editors. It will be presented as a demo at the WWW conference this year [1], but we would love to also get some feedback on it from this list. It's in an early stage and pretty slow when loading up, so have patience when you try it out here: http://km.aifb.kit.edu/sites/whovis/index.html, and be sure to read the "how to" section on the site. Alternatively you can watch the (semi-professional) screencast I did :P, it explains most of the functions.
The (disagreement) interactions are based on an extended version of the authorship extraction we do with wikiwho [2], and the graph drawing follows almost exactly the nice method proposed by Brandes et al. [3]. The code can be found on GitHub, both for the interaction-extraction extension of wikiwho [4] and the visualization itself [5], which basically produces JSON output for feeding the D3 visualization libraries we use. We have yet to generate output for more articles; so far we only show a handful for demonstration purposes. The whole thing also fits nicely (and was supposed to go along) with the IEG proposal that Pine had started on editor interaction [6].
word provenance/authorship API prototype:
Also, we have worked a bit on our early prototype for an API for word provenance/authorship:
You can get word/token-wise information on which revision each piece of content originated from (and thereby which editor originally authored the word) at
http://193.175.238.123/wikiwho/wikiwho_api.py?revid=<REV_ID>&name=<ARTICLENAME>&format=json
(<ARTICLENAME> -> name of the article in ns:0 of the English Wikipedia; <REV_ID> -> rev_id of that article for which you want the authorship information; format is currently only json)
Example: http://193.175.238.123/wikiwho/wikiwho_api.py?revid=649876382&name=Laura_Bu…
Output format is currently:
{"tokens": [
    {"token": "<FIRST TOKEN IN THE WIKI MARKUP TEXT>", "author_name": "<NAME OF AUTHOR OF THE TOKEN>", "rev_id": "<REV_ID WHEN TOKEN WAS FIRST ADDED>"},
    {"token": "<SECOND TOKEN IN THE WIKI MARKUP TEXT>", "author_name": "<NAME OF AUTHOR OF THE TOKEN>", "rev_id": "<REV_ID WHEN TOKEN WAS FIRST ADDED>"},
    {"token": "<THIRD TOKEN …
    … ],
 "message": null,
 "success": "true",
 "revision": {"article": "<NAME OF REQUESTED ARTICLE>", "time": "<TIMESTAMP OF REQUESTED REV_ID>", "reviid": <REQUESTED REV_ID>, "author": "<AUTHOR OF REQUESTED REV_ID>"}}
DISCLAIMER: there are problems with getting/processing the XML for larger articles right now, so don't be surprised if that gives you an error sometimes (e.g. querying "Barack Obama" or articles of similar size will *not* succeed for higher revision numbers). Also, we are working on the speed and on providing more precomputed articles (right now almost all are computed on request, although we save intermediary results). Still, for most articles it works fine and the output has been tested for accuracy (cf. [2]).
At some point in the future, this API will also be able to deliver the interaction data that the visualization is built on.
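As a rough illustration of how the JSON output could be consumed, here is a minimal Python sketch that counts tokens per original author. It assumes only the fields listed in the output format above ("tokens", "author_name", "success", "message"). The function name and the sample payload are made up for the example, and no request is sent to the prototype server.

```python
import json
from collections import Counter


def tokens_per_author(response_text):
    """Count how many tokens each author originally contributed.

    `response_text` is the raw JSON string returned for one revision.
    """
    data = json.loads(response_text)
    # The documented format reports success as the string "true";
    # a JSON boolean is accepted as well, just in case.
    if str(data.get("success")).lower() != "true":
        raise ValueError(data.get("message") or "request was not successful")
    return Counter(token["author_name"] for token in data["tokens"])


# Invented payload in the documented shape, for illustration only.
sample = json.dumps({
    "tokens": [
        {"token": "hello", "author_name": "Alice", "rev_id": "101"},
        {"token": "wiki", "author_name": "Bob", "rev_id": "102"},
        {"token": "world", "author_name": "Alice", "rev_id": "101"},
    ],
    "message": None,
    "success": "true",
})

counts = tokens_per_author(sample)
```

The resulting Counter can be sorted with most_common() to see who authored the largest share of the current text.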
I'm looking forward to your feedback :)
Cheers,
Fabian
[1] http://f-squared.org/wikiwho/demo32.pdf
[2] http://f-squared.org/wikiwho/
[3] http://dl.acm.org/citation.cfm?id=1526808
[4] https://github.com/maribelacosta/wikiwho
[5] https://github.com/wikiwho/whovis
[6] https://meta.wikimedia.org/wiki/Grants:IEG/Editor_Interaction_Data_Extracti…
--
Fabian Flöck
Research Associate
Computational Social Science department @GESIS
Unter Sachsenhausen 6-8, 50667 Cologne, Germany
Tel: + 49 (0) 221-47694-208
fabian.floeck(a)gesis.org<mailto:fabian.floeck@gesis.org>
www.gesis.org
www.facebook.com/gesis.org
Hi Oliver
On 07-04-2015 at 21:50, Oliver Keyes wrote:
> Hey all,
>
> Is anyone aware of research on the completeness of Wikidata, in terms
> of coverage and systemic bias? This seems like the sort of thing Max
> Klein might know ;). Papers, blog posts, anything.
>
"The sum of all human knowledge": A systematic review of scholarly
research on the content of Wikipedia
Journal of the Association for Information Science and Technology,
66(2):219–245, 2015 February.
http://www2.compute.dtu.dk/pubdb/views/edoc_download.php/6784/pdf/imm6784.p…
"Comprehensiveness" starts on page 4
Wikipedia research and tools: Review and comments.
http://www2.compute.dtu.dk/pubdb/views/edoc_download.php/6012/pdf/imm6012.p…
"Coverage" starts on page 9. See also Table 4.
For example:
"recency was to a certain extent a predictor for coverage"
"Astronomy and banksia somewhat overcited"
best
Finn Årup Nielsen