Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy, and an accompanying
group that would facilitate recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research
--
Bryan Song
GroupLens Research
University of Minnesota
Hi all,
If you use Hive on stat1002/1004, you may have seen a deprecation warning
when launching the hive client, saying that it is being replaced by
Beeline. The Beeline shell has always been available, but it required
supplying a database connection string every time, which was pretty
annoying. We now have a wrapper script
<https://github.com/wikimedia/operations-puppet/blob/production/modules/role…>
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual, and launching `beeline`.
There is some documentation on this here:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics or file a bug on Phabricator
<http://phabricator.wikimedia.org/tag/analytics>.
(If you are wondering stat1004 whaaat - there should be an announcement
coming up about it soon!)
Best,
--Madhu :)
We’re glad to announce the release of an aggregate clickstream dataset extracted from English Wikipedia:
http://dx.doi.org/10.6084/m9.figshare.1305770
This dataset contains counts of (referer, article) pairs aggregated from the HTTP request logs of English Wikipedia. This snapshot captures 22 million (referer, article) pairs from a total of 4 billion requests collected during the month of January 2015.
This data can be used for various purposes:
• determining the links people click most often from a given article
• determining the most common links people follow to reach an article
• determining what share of the total traffic to an article goes on to click a link in that article
• generating a Markov chain over English Wikipedia
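As a rough illustration of the uses listed above, here is a minimal Python sketch. It assumes the data has already been loaded as (referer, article, count) tuples; the sample rows, the "other-search" referer label, and all function names are made up for illustration and are not taken from the dataset or its documentation. The Markov chain here simply normalizes observed click counts per referer, so it ignores readers who leave without clicking.

```python
from collections import defaultdict

# Hypothetical sample rows in the form (referer, article, count).
# These values are made up for illustration; they are not from the dataset.
SAMPLE_ROWS = [
    ("England", "London", 50),
    ("other-search", "London", 40),
    ("London", "River_Thames", 30),
    ("London", "Big_Ben", 10),
    ("England", "River_Thames", 10),
]


def group_counts(rows, by_referer=True):
    """Nest counts as {key: {other: count}}, keyed by referer or by article."""
    grouped = defaultdict(lambda: defaultdict(int))
    for referer, article, count in rows:
        key, other = (referer, article) if by_referer else (article, referer)
        grouped[key][other] += count
    return grouped


def top_k(counts, k=3):
    """Return the k largest (name, count) pairs, ties broken by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def top_links_from(rows, article, k=3):
    """Most frequently clicked links on the given article."""
    return top_k(group_counts(rows, by_referer=True).get(article, {}), k)


def top_referers_to(rows, article, k=3):
    """Most common referers leading to the given article."""
    return top_k(group_counts(rows, by_referer=False).get(article, {}), k)


def click_through_fraction(rows, article):
    """Share of requests to the article that went on to click one of its links."""
    incoming = sum(c for _, a, c in rows if a == article)
    outgoing = sum(c for r, _, c in rows if r == article)
    return outgoing / incoming if incoming else None


def transition_probabilities(rows):
    """Row-normalized click counts: P(next article | current referer)."""
    probs = {}
    for referer, targets in group_counts(rows).items():
        total = sum(targets.values())
        probs[referer] = {a: c / total for a, c in targets.items()}
    return probs


if __name__ == "__main__":
    print(top_links_from(SAMPLE_ROWS, "London"))
    print(top_referers_to(SAMPLE_ROWS, "London"))
    print(click_through_fraction(SAMPLE_ROWS, "London"))
    print(transition_probabilities(SAMPLE_ROWS)["London"])
```

For the full 22 million pairs, the same grouping would probably be better done with a dataframe library or a database, but the logic stays the same.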
We created a page on Meta for feedback and discussion about this release: https://meta.wikimedia.org/wiki/Research_talk:Wikipedia_clickstream
Ellery and Dario
This was in the recent Research Newsletter:
https://www.econstor.eu/bitstream/10419/127472/1/847290360.pdf
They found a correlation between the length of articles about tourist
destinations and the number of tourists visiting them. They tried to
influence other destinations by adding content and did not find a
correlation in the subsequent number of tourists, suggesting that the
causation flows from tourism to article length instead.
But I was taken aback by the last line of their paper, "using the
suggested research design to study other areas of information
acquisition, such as medicine or school choices could be fruitful
directions."
Are there any ethical guidelines concerning whether this is
reasonable? Should there be?
I'd like to highlight two videos (some people may have already seen these)
that demo upcoming changes to edit review / RC patrol that take advantage
of ORES. I feel that the changes look promising, and I hope that RC
patrollers, Teahouse hosts, newbie adopters, and others will find that the
changes make their work easier. I also hope for improved retention of
good-faith contributors.
0. A succinct overview by Joe Matazzoni (WMF):
https://commons.wikimedia.org/w/index.php?title=File%3ANew-feature_demo%E2%…
1. A more extensive overview, also by Joe, including valuable context, from
the WMF Metrics Meeting for May 2017:
https://www.youtube.com/watch?v=rAGwQdLyFb4 between 15:00 and 28:15.
Pine
Hey everybody,
TL;DR: I wanted to let you know about an upcoming experimental Reddit AMA
("ask me anything") chat we have planned. It will focus on artificial
intelligence on Wikipedia and how we're working to counteract vandalism
while also making life better for newcomers.
We plan to hold this chat on June 1st at 21:00 UTC/14:00 PDT in the /r/iAMA
subreddit[1]. I'd love to answer any questions you have about these
topics, and I'll send a follow-up email to this thread shortly before
the AMA begins.
----
For those who don't know who I am, I create artificial intelligences[2]
that support the volunteers who edit Wikipedia[3]. I've been fascinated by
the ways that crowds of volunteers build massive, high quality information
resources like Wikipedia for over ten years.
For more background, I research and then design technologies that make it
easier to spot vandalism in Wikipedia—which helps support the hundreds of
thousands of editors who make productive contributions. I also think a lot
about the dynamics between communities and new users—and ways to make
communities inviting and welcoming to both long-time community members and
newcomers who may not be aware of community norms. For a quick sampling of
my work, check out my most impactful research paper about Wikipedia[3],
some recent coverage of my work from *Wired*[4], or check out the master
list of my projects on my WMF staff user page[5], the documentation for the
technology team I run[9], or the home page for Wikimedia Research[8].
This AMA, which I'm doing with the Foundation's Communications
department, is somewhat of an experiment. The intended audience for this
chat is people who might not currently be a part of our community but have
questions about the way we work—as well as potential research collaborators
who might want to work with our data or tools. Many may be familiar with
Wikipedia but not the work we do as a community behind the scenes.
I'll be talking about the work I'm doing with the ethics of AI and how we
think about artificial intelligence on Wikipedia, and ways we’re working to
counteract vandalism on the world’s largest crowdsourced source of
knowledge—like the ORES extension[6], which you may have seen highlighting
possibly problematic edits on your watchlist and in RecentChanges.
I’d love for you to join this chat and ask questions. If you do not use
Reddit, or prefer not to, we will also be taking questions on ORES'
MediaWiki talk page[7] and posting answers to both threads.
1. https://www.reddit.com/r/IAmA/
2. https://en.wikipedia.org/wiki/Artificial_intelligence
2. https://www.mediawiki.org/wiki/ORES
3. http://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfa…
4. https://www.wired.com/2015/12/wikipedia-is-using-ai-to-expand-the-ranks-of-…
5. https://en.wikipedia.org/wiki/User:Halfak_(WMF)
6. https://www.mediawiki.org/wiki/Extension:ORES
7. https://www.mediawiki.org/wiki/Talk:ORES
8. https://www.mediawiki.org/wiki/Wikimedia_Research
9. https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_team
-Aaron
Principal Research Scientist @ WMF
User:EpochFail / User:Halfak (WMF)
Hi all,
We are preparing to conduct some research into the process of how Requests
for Comments (RfCs) get discussed and closed. This work is further
described in the following Wikimedia page: https://meta.wikimedia.o
rg/wiki/Research:Discussion_summarization_and_decision_support_with_Wikum
To begin, we are planning to do a round of interviews with people who
participate in RfCs in English Wikipedia, including frequent closers,
infrequent closers, and people who participate in but don't close RfCs. We
will be asking them about how they go about closing RfCs and their opinions
on how the overall process could be improved. We are also creating a
database of all the RfCs on English Wikipedia that have gone through a
formal closure process and parsing their conversations.
While planning the interviews, we thought that the information that we
gather could be of interest to the Wikimedia community, so we wanted to
open it up and ask if there was anything you would be interested in
learning about RfCs or RfC closure from people who participate in them.
Also, if you know of existing work in this area, please let us know.
Thank you!
Amy
--
Amy X. Zhang | Ph.D. student at MIT CSAIL | http://people.csail.mit.edu/axz
| @amyxzh
Hello all,
I recently attended the 2017 Conference on Human Factors in Computing Systems (CHI) and put together a small report/reflection for Aaron Halfaker on some of the work presented there that I found interesting. The report is at https://meta.wikimedia.org/wiki/User:Hall1467/CHI_2017_Report if you’d like to check it out. CHI is a yearly human-computer interaction conference and is a common venue for studies on peer production communities such as Wikipedia.
Feel free to leave questions or comments on the talk page! Have a great rest of the week.
Andrew
Call for Submissions to the Doctoral Programme
10th Conference on Intelligent Computer Mathematics
- CICM 2017 -
July 17-21, 2017
University of Edinburgh, Scotland
http://www.cicm-conference.org/2017
----------------------------------------------------------------------
* Submission deadline (Abstract + CV): 5. June 2017
* Notification of acceptance: 8. June 2017
For further information, see below.
NEWS: Invited Speakers at CICM 2017
- Alan Bundy (University of Edinburgh)
- Przemysław Chojecki (Polish Academy of Sciences)
- Grant Olney Passmore (University of Cambridge)
----------------------------------------------------------------------
Digital and computational solutions are becoming the prevalent means
for the generation, communication, processing, storage and curation of
mathematical information. Separate communities have developed to
investigate and build computer based systems for computer algebra,
automated deduction, and mathematical publishing as well as novel user
interfaces. While all of these systems excel in their own right, their
integration can lead to synergies offering significant added
value. The Conference on Intelligent Computer Mathematics (CICM)
offers a venue for discussing and developing solutions to the great
challenges posed by the integration of these diverse areas.
CICM has been held annually as a joint meeting since 2008, co-locating
related conferences and workshops to advance work in these
subjects. Previous meetings have been held in Birmingham (UK 2008),
Grand Bend (Canada 2009), Paris (France 2010), Bertinoro (Italy 2011),
Bremen (Germany 2012), Bath (UK 2013), Coimbra (Portugal 2014),
Washington DC (USA 2015) and Bialystok (Poland 2016).
This is a call for papers for CICM 2017, which will be held in
Edinburgh, Scotland, July 17-21, 2017. CICM 2017 also invites work-in-
progress papers.
The principal tracks of the conference are:
* Track: Calculemus (chair: Matthew England)
* Track: Digital Mathematical Libraries (DML) (chair: Olaf Teschke)
* Track: Mathematical Knowledge Management (MKM) (chair: Florian Rabe)
* Track: Systems & Data (chair: Osman Hasan)
* Track: Doctoral Programme (chair: Adnan Rashid)
The overall programme is organized by the General Program Chair Herman
Geuvers. The local arrangements are coordinated by Jacques
Fleuriot. The publicity chair is Serge Autexier.
CICM is also an excellent opportunity for graduate students to meet
established researchers from the areas of computer algebra, automated
deduction, and mathematical publishing.
The Doctoral Programme provides a dedicated forum for PhD students to
present and discuss their ideas, ongoing or planned research, and
achieved results in an open atmosphere. It will consist of
presentations by the PhD students to get constructive feedback,
advice, and suggestions from the research advisory board, researchers,
and other PhD students. Each PhD student will be assigned to an
experienced researcher from the research advisory board who will act
as a mentor and who will provide detailed feedback and advice on their
intended and ongoing research.
Submissions to the Doctoral Programme are possible until 5. June; details
of the submission process are given below.
*Important Dates*
- Submission deadline (Abstract + CV): 5. June 2017
- Notification of acceptance: 8. June 2017
*Submission*
- Submission by e-mail to the DP chair Adnan Rashid
Email: adnan.rashid(a)seecs.edu.pk
Web: http://save.seecs.nust.edu.pk/adnanrashid/
More details on the conference are available from
http://www.cicm-conference.org/2017