Hi,
This month's research showcase
<https://www.mediawiki.org/w/index.php?title=Analytics/Research_and_Data/Sho…>
is scheduled for Wednesday, March 25, 11:30 (PST).
We will have two presentations: user session identification by Aaron
Halfaker, and mining missing hyperlinks in Wikipedia by Bob West.
As usual, the event will be recorded and publicly streamed on YouTube
(links will follow). We will hold a discussion and take questions from
remote participants via the Wikimedia Research IRC channel
(#wikimedia-research on freenode).
Looking forward to seeing you there.
Leila
So, we've got a new pageviews definition; it's nicely integrated and
spitting out TRUE/FALSE values on each row with the best of 'em. But
what does that mean for third-party researchers?
Well...not much, at the moment, because the data isn't being released
anywhere. But one resource we do have that third parties use a heck
of a lot is the per-page pageviews dumps on dumps.wikimedia.org.
Due to historical size constraints and decision-making (and by
historical I mean: last decade) these have a number of weirdnesses in
formatting terms; project identification is done using a notation
style not really used anywhere else, mobile/zero/desktop appear on
different lines, and the files are space-separated. I'd like to put
some volunteer time into spitting out dumps in an easier-to-work-with
format, using the new definition, to run in /parallel/ with the
existing logs.
*The new format*
At the moment we have the format:
project_notation - encoded_title - pageviews - bytes
This puts zero and mobile requests to pageX in a different place to
desktop requests, requires some reconstruction of project_notation,
and contains (for some use cases) extraneous information - that being
the byte-count. The files are also headerless, unquoted and
space-separated, which saves space but is sometimes...I think the term
is "eeeeh-inducing".
What I'd like to use as a new format is:
full_project_url - encoded_title - desktop_pageviews - mobile_and_zero_pageviews
This file would:
1. Include a header row;
2. Be formatted as a tab-separated, rather than space-separated, file;
3. Exclude bytecounts;
4. Include desktop and mobile pageview counts on the same line;
5. Use the full project URL ("en.wikivoyage.org") instead of the
pagecounts-specific notation ("en.v")
So, as a made-up example, instead of:
de.m.v Florence 32 9024
de.v Florence 920 7570
we'd end up with:
de.wikivoyage.org Florence 920 32
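To make the proposed conversion concrete, here is a minimal Python sketch of the reformatting step. It is not the script mentioned in this email, just an illustration under assumptions: the suffix table only covers the made-up "v" example above, an "m" component in the project notation is taken to mean mobile/zero, and the function names (split_project, reformat) are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumption: this suffix table only covers the made-up example in the
# email ("v" -> wikivoyage.org). A real converter needs the full mapping.
SUFFIX_TO_DOMAIN = {"v": "wikivoyage.org"}


def split_project(notation):
    """Turn e.g. 'de.m.v' into ('de.wikivoyage.org', True)."""
    language, *rest = notation.split(".")
    is_mobile = "m" in rest  # assumption: an 'm' component marks mobile/zero
    suffixes = [part for part in rest if part != "m"]
    # Assumption: no suffix at all means the Wikipedia project.
    domain = SUFFIX_TO_DOMAIN[suffixes[0]] if suffixes else "wikipedia.org"
    return f"{language}.{domain}", is_mobile


def reformat(legacy_lines):
    """Merge legacy lines into the proposed tab-separated format."""
    counts = defaultdict(lambda: [0, 0])  # [desktop, mobile_and_zero]
    for line in legacy_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        notation, title, views, _byte_count = fields  # byte count is dropped
        project, is_mobile = split_project(notation)
        counts[(project, title)][1 if is_mobile else 0] += int(views)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["full_project_url", "encoded_title",
                     "desktop_pageviews", "mobile_and_zero_pageviews"])
    for (project, title), (desktop, mobile) in sorted(counts.items()):
        writer.writerow([project, title, desktop, mobile])
    return out.getvalue()


print(reformat(["de.m.v Florence 32 9024", "de.v Florence 920 7570"]), end="")
```

Fed the two made-up legacy lines, this emits a header row followed by a single tab-separated row for de.wikivoyage.org, with the desktop and mobile counts side by side.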
In the future we could also work to /normalise/ the title - replacing
it with the page title that refers to the actual pageID. This won't
impact legacy files, and is currently blocked on the Apps team, but
should be viable as soon as that blocker goes away.
I've written a script capable of parsing and reformatting the legacy
files, so we should be able to backfill in this new format too, if
that's wanted (see below).
*The size constraints*
There really aren't any. Like I said, the historical rationale for a
lot of these decisions seems to have been keeping the files small. But
by putting requests to the same title from different site versions on
the same line, and dropping byte-count, we save enough space that the
resulting files are approximately the same size as the old ones - or
in many cases, actually smaller.
*What I'm asking for*
Feedback! What do people think of the new format? What would they like
to see that they don't? What don't they need, here? How useful would
normalisation be? How useful would backfilling be?
*What I'm not asking for*
WMF time! Like I said, this is a spare-time project; I've also got
volunteers for Code Review and checking, too (Yuvi and Otto).
The replacement of the old files! Too many people depend on that
format and that definition, and I don't want to make them sad.
Thoughts?
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
Forwarding announcement.
Pine
---------- Forwarded message ----------
From: "Manprit Brar" <mbrar(a)wikimedia.org>
Date: Mar 18, 2015 1:19 PM
Subject: [Wikimedia Announcements] Announcing Wikimedia Foundation's New
Open Access Policy
To: <wikimediaannounce-l(a)lists.wikimedia.org>
Cc:
Hi all,
We're proud to announce that the Wikimedia Foundation today joins the
growing ranks of major institutions with open access policies. Our new Open
Access Policy <https://wikimediafoundation.org/wiki/Open_access_policy>[1]
will ensure that all research the Wikimedia Foundation supports through
grants, equipment, or research collaboration is made widely accessible and
reusable. Research, data, and code developed through these collaborations
will be made available in open access venues and under a free license
<http://freedomdefined.org/>[2] in keeping with the Wikimedia Foundation’s
mission to support free knowledge. You can read more about this effort in
today's blog post
<https://blog.wikimedia.org/2015/03/18/wikimedia-open-access-policy/>[3].
[1] https://wikimediafoundation.org/wiki/Open_access_policy
[2] http://freedomdefined.org/Definition
[3] https://blog.wikimedia.org/2015/03/18/wikimedia-open-access-policy/
Manprit Brar
Legal Counsel
*Wikimedia Foundation*
*NOTICE*: This message may be confidential or legally privileged. If you
have received it by accident, please delete it and let us know about the
mistake. As an attorney for the Wikimedia Foundation, for legal/ethical
reasons I cannot give legal advice to, or serve as a lawyer for, community
members, volunteers, or staff members in their personal capacity. For more
on what this means, please see our legal disclaimer
<https://meta.wikimedia.org/wiki/Wikimedia_Legal_Disclaimer>.
_______________________________________________
Please note: all replies sent to this mailing list will be immediately
directed to Wikimedia-l, the public mailing list of the Wikimedia
community. For more information about Wikimedia-l:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
_______________________________________________
WikimediaAnnounce-l mailing list
WikimediaAnnounce-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimediaannounce-l
Dear all,
Recently I have gathered and plotted the world Internet users from 2000
onwards based on three different world geographic categorization schemes:
UN, World Bank, and CIA
http://people.oii.ox.ac.uk/hanteng/2015/03/18/the-new-internet-world-un-wor…
In this blog post, I have noticed that the Wikimedia Foundation's
geographic categorization seems to be different from the Stats page and the
International development team, singling out the distinct absence/presence
of the category of Middle East and North Africa (MENA), a category that is
used only in the World Bank categorization scheme.
Personally I do not prefer any one of the categorization schemes; I believe
in open data, open research, and open solutions. So it is up to researchers
and practitioners to decide which categorization schemes to use.
Nonetheless, it points to a practice gap in the current research into
"global Wikimedia content and communities". Thus I raise a few questions
in the blog post, quoted below, about how we can systematically compare and
share Wikimedia's global activities in editing, viewing, fundraising
and fund dissemination:
The above charts not only show the important baselines for any Websites
(assuming that Internet users are also Web users) for their global
strategies for growth, but also demonstrate the importance of the choice of
geographic categorization scheme. .....
...., it is relevant and important if the Wikimedia Foundation can release
its global activities, including the viewing, editing and fundraising
numbers so that researchers and Wikipedians can compare those numbers with
the world distribution of Internet users. It would be interesting to see,
for example, how the funds are raised from around the world and then
distributed across different geographic regions, per Internet user.
I believe that reports aggregated to the level of any of
the three geographic categorization schemes (CIA, World Bank, and UN) should
raise far fewer privacy concerns and thus should be published regularly: if
not monthly, then at least annually.
Feel free to forward this to another mailing list of the larger Wikimedia
family if you find it relevant.
Best,
han-teng liao
Please see below for details. Note that there are several research tracks
including a Wikipedia Research track.
OPENSYM 2015, THE 11TH INTERNATIONAL SYMPOSIUM ON OPEN COLLABORATION
August 19-21, 2015 | San Francisco, California, U.S.A.
http://opensym.org/os2015 | ACM SIGWEB and ACM SIGSOFT supported
ABOUT THE CONFERENCE
The 11th International Symposium on Open Collaboration (OpenSym 2015) is
the premier conference on open collaboration research and practice,
including free/libre/open source software, open data, IT-driven open
innovation research, wikis and related open collaborative media, and
Wikipedia and related Wikimedia projects.
OpenSym brings together the different strands of open collaboration
research and practice, seeking to create synergies and inspire new
collaborations between computer science and information systems
researchers, social scientists, legal scholars, and everyone interested
in understanding open collaboration and how it is changing the world.
OpenSym 2015 will be held in San Francisco, California, on *August
19-21, 2015*.
This is the general call for papers and includes the
- research track call for submissions,
- industry and community track call for submissions, and
- doctoral symposium call for submissions.
OpenSym is held in cooperation with ACM SIGWEB and ACM SIGSOFT. As in
previous years, the conference proceedings will be archived in the ACM
digital library.
RESEARCH TRACK CALL FOR SUBMISSIONS
The conference provides the following peer-reviewed research tracks.
- Free/libre/open source software research, chaired by Carlos Jensen of
Oregon State University and Gregorio Robles of Universidad Rey Juan
Carlos. This track seeks papers on all aspects of FLOSS. For detailed
topics and the research track committee please see http://wp.me/Pezfy-IU.
- IT-driven open innovation research, chaired by Ann Majchrzak of
University of Southern California and Arvind Malhotra of University of
North Carolina at Chapel Hill. This track is devoted to research on the
process of expanding research and development activities beyond the
boundaries of single company structures. For detailed topics and the
research track committee please see http://wp.me/Pezfy-J3.
- Open data research, chaired by Carl Lagoze of University of Michigan.
This track contributes to the increasing awareness of Open Data in
research. For detailed topics and the research track committee please
see http://wp.me/Pezfy-J5.
- Wikis and open collaboration research, chaired by Kevin Crowston of
Syracuse University. This track is dedicated to the science and
application of wikis and open collaboration technology outside of the
context of Wikipedia. For detailed topics and the research track
committee please see http://wp.me/Pezfy-J7.
- Wikipedia and related projects research, chaired by Claudia
Müller-Birn of Freie Universität Berlin and Aaron Shaw of Northwestern
University. This track addresses research specifically on Wikipedia and
associated projects. For detailed topics and the research track
committee please see http://wp.me/Pezfy-J9.
Research papers present integrative reviews or original reports of
substantive new work: theoretical, empirical, and/or in the design,
development and/or deployment of novel concepts, systems, and
mechanisms. Research papers will be reviewed by a research track program
committee to meet rigorous academic standards of publication. Papers
will be reviewed for relevance, conceptual quality, innovation and
clarity of presentation.
Authors can submit full papers (5-10 pages), short papers (2-4 pages),
and research posters (1-2 pages). For more details on paper types please
see http://wp.me/Pezfy-Je.
Submission deadline for all research contributions is *March 29th, 2015*.
Authors submit through EasyChair at
https://easychair.org/conferences/?conf=opensym2015. Submissions and
final contributions must follow the ACM SIG Proceedings template found at
http://www.acm.org/sigs/publications/proceedings-templates.
OpenSym seeks to accommodate the needs of the different research
disciplines it draws on. Authors whose submissions have been accepted
for presentation at the conference have a choice of having
- their paper become part of the official proceedings, archived in the
ACM Digital Library,
or having
- only a short abstract included in the proceedings (rather than the
full submitted paper) in order to preserve future publication possibilities.
DOCTORAL SYMPOSIUM CALL FOR SUBMISSIONS
OpenSym seeks to explore the synergies between all strands of open
collaboration research. Thus, we will have a doctoral symposium, in
which Ph.D. students from different disciplines can present their work
and receive feedback from senior faculty and their peers.
Submission deadline for doctoral symposium position papers is *May 3rd,
2015*.
Authors submit through EasyChair at
https://easychair.org/conferences/?conf=opensym2015. Submissions and
final contributions must follow the ACM SIG Proceedings template found
at http://www.acm.org/sigs/publications/proceedings-templates.
More information is available at http://wp.me/Pezfy-Jh.
INDUSTRY AND COMMUNITY TRACK CALL FOR SUBMISSIONS
OpenSym is also seeking submissions for experience reports (full and
short), tutorials, workshops, panels, non-research posters, and demos.
Such work accepted for presentation or performance at the conference is
considered part of the industry and community track. It will be put into
the proceedings in an industry and community track section; authors can
opt-out of the publication, as with research papers, but will still have
to provide an abstract (less than one page) for the proceedings.
Submission deadline for industry and community track papers is *April
19, 2015*.
Authors submit through EasyChair at
https://easychair.org/conferences/?conf=opensym2015. Submissions and final
contributions must follow the ACM SIG Proceedings template found at
http://www.acm.org/sigs/publications/proceedings-templates.
More information is available at http://wp.me/Pezfy-Jh.
THE OPENSYM CONFERENCE EXPERIENCE
OpenSym 2015 will be held in San Francisco, California, on August 19-21,
2015.
Research, industry, and community presentations and performances will be
accompanied by keynotes, invited speakers, and a social program in one
of the most vibrant cities on this planet.
The open space track is a key ingredient of the event that distinguishes
OpenSym from other conferences. It is an integral part of the program
that makes it easy to talk to other researchers and practitioners and to
stretch your imagination and conversations beyond the limits of your own
sub-discipline, exposing you to the full breadth of open collaboration
research. The open space track is entirely participant-organized, is
open for everyone, and requires no submission or review.
The general chair of the conference is Dirk Riehle of
Friedrich-Alexander University Erlangen-Nürnberg. Feel free to contact
us with any questions you might have at info(a)opensym.org.
Dear all,
I have some findings suggesting that a page-views-per-Internet-user
measurement may help compare different language editions of Wikipedia.
Criticism and suggestions are welcome.
-----
http://people.oii.ox.ac.uk/hanteng/2015/03/15/comparing-language-developmen…
Which language version of Wikipedia enjoys the most page views per language
Internet user, relative to what would be expected? It is Finnish. In terms of
absolute positive and negative gaps, English has the widest positive gap
whereas Chinese has the largest negative gap.
......
In particular, it is known that Wikipedia (and Google which often favours
Wikipedia) faces local competition in the People's Republic of China and
South Korea. Therefore it is understandable that page views may be lower in
the Chinese and Korean Wikipedia language projects simply because some users'
needs to read user-generated encyclopedias are satisfied by other websites.
However, it remains an important question to examine why these particular
Latin and Asian languages are under-developed for Wikipedia projects.
FYI, people looking at revert logs may be interested in this satellite
event at ICWSM'15.
Cheers,
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA
☞ http://www.glciampaglia.com/
✆ +1 812 855-7261
✉ gciampag(a)indiana.edu
---------- Forwarded message ----------
From: Nicola Perra <nicolaperra(a)gmail.com>
Date: 2015-03-11 10:00 GMT-04:00
*Modeling and Mining Temporal Interactions (M2TI)*
*ICWSM'15 workshop.*
*Oxford, UK, May 26, 2015*
*Webpage:* http://m2ti.weebly.com/
The emergence of the Big Data paradigm together with the framework of
complex networks has played a crucial role in providing the tools and
datasets to begin understanding human interactions and the dynamics of
social systems with wide applications to informatics, social sciences,
information technology, and epidemiology.
The rise of the social web has resulted in the creation of an unprecedented
number of social systems that can be analyzed in great detail. This unique
opportunity has fostered an interdisciplinary effort to study their
structure and dynamics with network theory taking the forefront. As a
result, researchers have unveiled a number of surprising properties about
the structure and dynamics of large-scale social systems such as online
social networks, scientific and OSS collaboration networks, or mobile phone
communication networks.
Due to both theoretical and practical difficulties, until recently, the
majority of studies have focused on static representations of the social
interactions, where the underlying network structure does not change over
time. While this assumption has proven to be useful and allowed for great
progress recently, its limitations are now becoming clear. Static
representations miss the timing, duration, and concurrency of social
interactions, which are crucial to characterize many processes such as the
spread of rumors, information and epidemics, the emergence of social norms
and memes or even online congestion, resource depletion and mutual
influence, among others. In general, neglecting the network dynamics might
lead to mischaracterizations of the dynamical processes.
Today, thanks to the ubiquity of online services, social media, GPS enabled
smartphones and the rise of wearable computers, rich time-resolved datasets
of both user behaviors and interactions are available. Empirical data on
human interaction dynamics now include user access logs, email
communications, chatting, phone call and SMS metadata, online forum
discussions, collaborations in software projects, movie participations,
geo-located individual mobility datasets, location check in, and many
others. The availability of such data is triggering a new wave of
data-driven studies of temporal properties of human behavior,
communication, and social interactions.
The mining and study of the temporal characteristics of social dynamics
raises new fundamental challenges, both theoretical and computational, with
applications to fields such as social sciences, marketing, computer
science, and epidemiology. In particular, new tools and frameworks are
needed to mine, characterize and model temporal behaviors as well as the
profound consequences that individual characteristics and behavioral
correlations have on processes under study.
*Topics and Themes*
The list of topics that we aim to cover at the workshop is the following:
- Data mining approaches for time-resolved datasets
- Representation of time-resolved datasets
- Dynamic interplay between social temporal networks and human behavior
- Analysis and characterization of temporal networks
- Modeling of temporal networks
- Behavioral pattern mining
- User modeling and analysis of behavioral patterns
- Processes on temporal networks
- Data driven models of collaboration
- Influence models
- User and product recommendation based on temporal patterns
- Spatio-temporal correlation between interactions and service usage
The meeting will be one day long and will host around 15 contributed talks.
Submissions will be evaluated and selected by the Program Committee
members, based on adherence to the theme of the satellite,
originality and scientific soundness.
Abstracts are submitted via the EasyChair website:
https://easychair.org/conferences/?conf=m2ti
Authors who do not have an EasyChair account should sign up for an account
(for identification purposes, make sure to use the same email address as
the one used for the conference registration).
We solicit research papers (up to 8 pages) and extended abstracts (max 2
pages with one illustration/figure).
Extended abstracts will not be included in the proceedings of the conference.
Once the selection process is complete, the authors of the accepted
abstracts will be notified by e-mail.
Papers submission: *March 13, 2015*
Acceptance notification: March 27, 2015
Camera-ready paper due: April 3, 2015
Workshop Day: May 26, 2015
*Invited Speakers*:
- Bruno Ribeiro, Carnegie Mellon University
- Suzy Moat, Warwick University
*Organizers*:
- Bruno Goncalves
- Marton Karsai
- Nicola Perra
Hi all, we produced a prototype of an editor-editor interaction network visualization for individual articles, based on the word/tokens deleted and reintroduced by editors. It will be presented as a demo at the WWW conference this year [1], but we would love to also get some feedback on it from this list. It's in an early stage and pretty slow when loading up, so have patience when you try it out here: http://km.aifb.kit.edu/sites/whovis/index.html, and be sure to read the "how to" section on the site. Alternatively you can watch the (semi-professional) screencast I did :P, it explains most of the functions.
The (disagreement) interactions are based on an extended version of the extraction of authorship we do with wikiwho [2], and the graph drawing is done almost exactly after the nice method proposed by Brandes et al. [3]. The code can be found on GitHub, both for the interaction-extraction extension of wikiwho [4] and the visualization itself [5], which basically produces a JSON output for feeding the D3 visualization libraries we use. We have yet to generate output for more articles; so far we only show a handful for demonstration purposes. The whole thing also fits nicely (and was supposed to go along) with the IEG proposal that Pine had started on editor interaction [6].
word provenance/authorship API prototype:
Also, we have worked a bit on our early prototype for an API for word provenance/authorship:
You can get word/token-wise information about which revision each piece of content originated in (and thereby which editor originally authored the word) at
http://193.175.238.123/wikiwho/wikiwho_api.py?revid=<REV_ID>&name=<ARTICLENAME>&format=json
(<ARTICLENAME> -> name of the article in ns:0, in the English Wikipedia; <REV_ID> -> rev_id of that article for which you want the authorship information; format is currently only json)
Example: http://193.175.238.123/wikiwho/wikiwho_api.py?revid=649876382&name=Laura_Bu…
Output format is currently:
{"tokens": [{"token": "<FIRST TOKEN IN THE WIKI MARKUP TEXT>", "author_name": "<NAME OF AUTHOR OF THE TOKEN>", "rev_id": "<REV_ID WHEN TOKEN WAS FIRST ADDED>"}, {"token": "<SECOND TOKEN IN THE WIKI MARKUP TEXT>", "author_name": "<NAME OF AUTHOR OF THE TOKEN>", "rev_id": "<REV_ID WHEN TOKEN WAS FIRST ADDED>"}, {"token": "<THIRD TOKEN …
… ], "message": null, "success": "true", "revision": {"article": "<NAME OF REQUESTED ARTICLE>", "time": "<TIMESTAMP OF REQUESTED REV_ID>", "reviid": <REQUESTED REV_ID>, "author": "<AUTHOR OF REQUESTED REV_ID>"}}
DISCLAIMER: there are problems with getting/processing the XML for larger articles right now, so don't be surprised if that gives you an error sometimes (e.g. querying "Barack Obama" and articles of similar size will *not* succeed for higher revision numbers). Also, we are working on the speed and on providing more precomputed articles (right now almost all are computed on request, although we save intermediary results). Still, for most articles it works fine and the output has been tested for accuracy (cf. [2]).
At some point in the future, this API will also be able to deliver the interaction data that the visualization is built on.
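As a rough illustration of how the prototype API described above could be consumed, here is a minimal Python sketch. It only builds the query URL and summarises a response of the documented shape; it does not contact the server. The function names (build_query_url, tokens_per_author), the article name "Example_Article", and the sample response are hypothetical. The field names are taken from the output format given above, on the assumption that the prototype keeps that format.

```python
import json
from collections import Counter
from urllib.parse import urlencode

# Endpoint as given in this email; a prototype, so it may change or vanish.
BASE_URL = "http://193.175.238.123/wikiwho/wikiwho_api.py"


def build_query_url(rev_id, article_name):
    """Build the query URL for one revision of one ns:0 article."""
    params = {"revid": rev_id, "name": article_name, "format": "json"}
    return BASE_URL + "?" + urlencode(params)


def tokens_per_author(response_text):
    """Count how many tokens each author originally contributed."""
    data = json.loads(response_text)
    # The documented format reports success as the string "true".
    if data.get("success") != "true":
        raise ValueError(data.get("message") or "request did not succeed")
    return Counter(token["author_name"] for token in data["tokens"])


# Hypothetical response, shaped like the documented output format.
SAMPLE = json.dumps({
    "tokens": [
        {"token": "hello", "author_name": "Alice", "rev_id": "100"},
        {"token": "wiki", "author_name": "Bob", "rev_id": "105"},
        {"token": "world", "author_name": "Alice", "rev_id": "100"},
    ],
    "message": None,
    "success": "true",
})

print(build_query_url(649876382, "Example_Article"))
print(tokens_per_author(SAMPLE).most_common())
```

Fetching the URL (for example with urllib.request.urlopen) and passing the decoded body to tokens_per_author would give a per-editor token count for that revision, subject to the size limits mentioned in the disclaimer.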
I'm looking forward to your feedback :)
Cheers,
Fabian
[1] http://f-squared.org/wikiwho/demo32.pdf
[2] http://f-squared.org/wikiwho/
[3] http://dl.acm.org/citation.cfm?id=1526808
[4] https://github.com/maribelacosta/wikiwho
[5] https://github.com/wikiwho/whovis
[6] https://meta.wikimedia.org/wiki/Grants:IEG/Editor_Interaction_Data_Extracti…
--
Fabian Flöck
Research Associate
Computational Social Science department @GESIS
Unter Sachsenhausen 6-8, 50667 Cologne, Germany
Tel: + 49 (0) 221-47694-208
fabian.floeck(a)gesis.org
www.gesis.org
www.facebook.com/gesis.org
On Mon, Feb 16, 2015 at 10:41 PM, koltzenburg(a)w4w.net <koltzenburg(a)w4w.net>
wrote:
> ____Aaron wrote:
> "higher quality survey data"
> well, and how does one recognize low quality and how come it is so low?
> and "quality" by whose epistemological aims and standards?
>
> "causes and mechanisms that drive the gender gap (and related
> participation gaps)"
> which "related participation gaps" do you have in mind here?
>
Jane's response was helpful and similar to mine.
Based on existing surveys, there are demographic and social categories of
people who are underrepresented among current editors. I don't have
specifics off the top of my head, but if you look at WMF survey results for
US editors and compare the findings to US census data (for example), you
can get an idea of some categories. Women are underrepresented to an
extreme degree, but they are not the only population that does not seem to
edit en:WP. I am less knowledgeable about other WPs, but I suspect there
are other inequalities and gaps on other wikis.
> where would these gaps be situated in terms of areas of participation?
>
See above.
> and, again, in which language version(s)?
>
See above.
On Mon, Feb 16, 2015 at 11:38 PM, Federico Leva (Nemo) <nemowiki(a)gmail.com>
wrote:
>
> Speaking of which, the WMF doesn't have resources to appropriately
> process the 2012 survey data, so results aren't available yet. Did you
> consider offering them to take care of it, at least for the gendergap
> number? You would then be able to publish an update.
>
> https://meta.wikimedia.org/wiki/Research_talk:Wikipedia_Editor_Survey_2012#…
As before, my understanding is that the method by which respondents were
selected to participate in the survey does not meet standard methods of
survey sampling (see this chunk
<https://meta.wikimedia.org/wiki/Research:Wikipedia_Editor_Survey_2012#When.…>
of
the description of the survey). As a result, I do not trust the results of
the 2012 survey to generate precise estimates of the gender gap or other
demographic details about participation. I've spoken to some very receptive
folks at the foundation about this and I hope that they/we will be able to
improve it in the future. I'm eager to help improve the survey data
collection procedures. Unfortunately, I do not have the capacity to analyze
the current survey data in greater depth.
The thing that allowed Mako and me to do the study that we published in
PLOS ONE was the fact that (1) the old UNU-Merit & WMF survey sought to
include readers as well as editors; *and* (2) at the exact same time Pew
carried out a survey in which they asked a nearly identical question about
readership. We used the overlapping results about WP readership from both
surveys to generate a correction for the data about editorship. Without
similar data on readership and similar data from a representative sample of
some reference population (in the case of the Pew survey, US adults), we
cannot perform the same correction. As a result, I do not feel comfortable
estimating how biased (or unbiased) the 2012 survey results may be.
a
On Tue, Feb 17, 2015 at 12:00 AM, Jane Darnell <jane023(a)gmail.com> wrote:
> Hi Claudia,
> I responded to your questions in the text - hope it's readable.
> Jane
>
> ____WereSpielChequers wrote:
> "the community is more abrasive towards women"
>
> I think he is simply referring to earlier discussions where the
> conclusion was "the community can be perceived to be abrasive" and this
> conclusion, in yet other discussions led to this conclusion, which should
> be rephrased as "the community is more often perceived as abrasive by women
> than by men"
>
> ____Kerry wrote:
> "But I would agree that if an organisation sets a target (25% women in this
> particular case) and then does not put in place a means of measuring the
> progress against that target, one has to question the point of
> establishing a
> target."
>
> ___Claudia (responding to Kerry):
> I think one has to question the point of not putting in place a means of
> measuring the progress...
> and also ask why, if the issue is a high priority (allegedly, one might
> add, in
> speeches at meetings, in interviews with the press...) this organisation
> does
> not fund any top level research... - or does it?
>
> I think here you are forgetting about the "holy shit graph" which shows
> a reduction in the number of active editors over time. This is much more of
> a direct threat to the Wikiverse than the gendergap, which, as has been
> stated before, is only one of many serious gaps in knowledge coverage.
> Oddly, I think it is one of the easiest of all "participatory gaps" to
> measure, but we seem to constantly get stranded in objections to ways that
> previous editor surveys have been held, leading to the strange situation of
> never actually being able to run even one editor survey twice. Since we
> have not yet been able to establish any trend at all, we are only comparing
> apples to oranges.
>
> ____Aaron wrote:
> "higher quality survey data"
> __Claudia (responding to Aaron): ...how does one recognize low quality..?
> Hmm. I just looked and I couldn't find the criticism of the various editor
> surveys. Is this stashed somewhere on meta? Or do we need to sift through
> reams of emails until we find all the various objections? Objections
> galore, as I recall.
>
> ___Claudia: which "related participation gaps" do you have in mind here?
> Off the top of my head, some of these would be
>
> 1) lack of geographical editor coverage such as active editors in rural
> areas or even in whole states such as Wyoming or South Dakota and the whole
> "Global South participation problem" (the Global South participation
> problem is even helped along inadvertently by the new read-only
> "Wikipedia-zero" effect);
> 2) lack of topical expertise on subjects that technically don't lend
> themselves well to the Wikiverse, such as auditory fields (musical
> production) or visual fields (how to paint, how to make movies, how to
> choreograph motion)
> 3) lack of topical expertise on subjects that legally don't lend
> themselves well to the Wikiverse, such as articles about artworks under
> copyright that cannot be illustrated in an article;
> 4) lack of topical editor coverage on subjects previously shut out - there
> is still unwillingness by a whole group to re-enter the Wikiverse after
> being banned (earlier shut-outs such as blocking whole institution-wide ip
> ranges for vandalism or whole areas of expertise such as groups of writers
> for their COI editing, carry with them a history of anti-Wikipedia
> sentiment that lasts a long time in various enclaves)
>
> ___Claudia:
> and, again, in which language version(s)?
> That's easy - the languages that we can technically support but don't yet
> have Wikipedias for and the languages for which we don't even have the
> fonts to display them.
>
> best,
> Claudia
>
>
> On Tue, Feb 17, 2015 at 7:41 AM, <koltzenburg(a)w4w.net> wrote:
>
>> Hi WereSpielChequers, Kerry, Aaron and all,
>>
>> ____WereSpielChequers wrote:
>> "the community is more abrasive towards women"
>>
>> this may be stats expert discourse, but let me show you how the question
>> itself has a gendered slant.
>> imagine what would happen - also in your research design - if it read:
>> "the
>> community is less abrasive towards men" - how does this compare to the
>> first question re who are "the community"?
>>
>> and again, re phasing ten years in 2011 and four years on, which language
>> version(s) are hypotheses based on?
>>
>> ____Kerry wrote:
>> "But I would agree that if an organisation sets a target (25% women in
>> this
>> particular case) and then does not put in place a means of measuring the
>> progress against that target, one has to question the point of
>> establishing a
>> target."
>>
>> I think one has to question the point of not putting in place a means of
>> measuring the progress...
>> and also ask why, if the issue is a high priority (allegedly, one might
>> add, in
>> speeches at meetings, in interviews with the press...) this organisation
>> does
>> not fund any top level research... - or does it?
>>
>> ____Aaron wrote:
>> "higher quality survey data"
>> well, and how does one recognize low quality and how come it is so low?
>> and "quality" by whose epistemological aims and standards?
>>
>> "causes and mechanisms that drive the gender gap (and related
>> participation gaps)"
>> which "related participation gaps" do you have in mind here?
>> where would these gaps be situated in terms of areas of participation?
>> and, again, in which language version(s)?
>>
>> best,
>> Claudia
>>
>> ---------- Original Message -----------
>> From:aaron shaw <aaronshaw(a)northwestern.edu>
>> To:Research into Wikimedia content and communities <wiki-research-
>> l(a)lists.wikimedia.org>
>> Sent:Mon, 16 Feb 2015 20:50:17 -0800
>> Subject:Re: [Wiki-research-l] a cautious note on gender stats Re: Fwd:
>> [Gendergap] Wikipedia readers
>>
>> > Hi all!
>> >
>> > Thanks, Jeremy & Dariusz for following up.
>> >
>> > On Mon, Feb 16, 2015 at 5:58 AM, Dariusz
>> > Jemielniak <darekj(a)alk.edu.pl> wrote:
>> >
>> > > As far as I recall, they did a follow-up on this topic, and maybe a
>> > > publication coming up?
>> >
>> > Sadly, no follow ups at the moment.
>> >
>> > If we want to have a more precise sense of the
>> > demographics of participants the biggest need in
>> > this space is simply higher quality survey data.
>> > My paper with Mako has a lot of detail about why
>> > the 2008 editor survey (and all subsequent editor
>> > surveys, to my knowledge) has some profound limitations.
>> >
>> > The identification and estimation of the effects
>> > of particular causes and mechanisms that drive the
>> > gender gap (and related participation gaps)
>> > presents an even tougher challenge for
>> > researchers and is an area of active inquiry.
>> >
>> > all the best,
>> > Aaron
>> ------- End of Original Message -------
>>
>> _______________________________________________
>> Wiki-research-l mailing list
>> Wiki-research-l(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>
>