Wiki-research-l January 2018

wiki-research-l@lists.wikimedia.org

30 participants
29 discussions

Wikipedia Research policy
by song＠cs.umn.edu 14 Jul '23

Pursuant to prior discussions about the need for a research policy on Wikipedia, WikiProject Research is drafting a policy regarding the recruitment of Wikipedia users to participate in studies. At this time, we have a proposed policy, and an accompanying group that would facilitate recruitment of subjects in much the same way that the Bot Approvals Group approves bots.

The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research

The Subject Recruitment Approvals Group mentioned in the proposal is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group

Before we move forward with seeking approval from the Wikipedia community, we would like additional input about the proposal, and would welcome additional help improving it. Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research

--
Bryan Song
GroupLens Research
University of Minnesota

[Analytics] Beeline as Hive client
by Madhumitha Viswanathan 03 Oct '18

Hi all,

For all Hive users using stat1002/1004, you might have seen a deprecation warning when you launch the hive client, saying that it is being replaced with Beeline. The Beeline shell has always been available to use, but it required supplying a database connection string every time, which was pretty annoying. We now have a wrapper script <https://github.com/wikimedia/operations-puppet/blob/production/modules/role…> set up to make this easier. The old Hive CLI will continue to exist, but we encourage moving over to Beeline.

You can use it by logging into the stat1002/1004 boxes as usual and launching `beeline`. There is some documentation on this here: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.

If you run into any issues using this interface, please ping us on the Analytics list or #wikimedia-analytics, or file a bug on Phabricator <http://phabricator.wikimedia.org/tag/analytics>.

(If you are wondering stat1004 whaaat - there should be an announcement coming up about it soon!)

Best,
--Madhu :)
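To illustrate what such a wrapper saves users from typing, here is a minimal sketch in Python. It is not the actual wrapper script linked in the message; the host name, port, and function name are hypothetical placeholders. The only parts taken from Beeline itself are the `-u` (JDBC URL) and `-n` (user name) options and the `jdbc:hive2://` URL scheme.

```python
import getpass
import shlex

# Hypothetical HiveServer2 endpoint: the message does not give the real host or port.
HIVE_HOST = "hive-server.example.org"
HIVE_PORT = 10000


def beeline_command(database="default", user=None, extra_args=()):
    """Build the argv list for a Beeline session with the JDBC URL pre-filled."""
    user = user or getpass.getuser()
    jdbc_url = f"jdbc:hive2://{HIVE_HOST}:{HIVE_PORT}/{database}"
    # -u passes the JDBC connection URL, -n the user name to connect as.
    return ["beeline", "-u", jdbc_url, "-n", user, *extra_args]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without a cluster.
    print(shlex.join(beeline_command()))
```

A real wrapper would typically hand the resulting list to `subprocess.run` or `os.execvp` rather than print it; the point is only that the connection string is assembled once, in one place, instead of being typed on every launch.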

What percentage of digital assistants cite Wikipedia?
by Stella Yu 09 Jun '18

Curious, what percentage of digital assistants (Alexa, Siri, Cortana, Google) cite Wikipedia when a person asks a question? Does the current Wikipedia mobile app support voice search? Are there any reports on this? Thanks in advance! Sincere regards, Stella -- Stella Yu | STELLARESULTS | 415 690 7827 "Chronicling heritage brands and legendary people."

Wiki Workshop 2018 Announcement and Call for Papers
by Leila Zia 19 Apr '18

Hi everyone,

We are excited to announce that the 5th annual Wiki Workshop [1] will take place in Lyon on April 24, 2018, as part of The Web Conference 2018 (a.k.a. WWW2018) [2]. You can access the call for papers at http://wikiworkshop.org/2018/#call . Please submit your ongoing or completed research related to Wikimedia projects to the workshop. Note that 2018-01-28 is the submission deadline if you want your paper to appear in the proceedings, and 2018-03-11 is the deadline for all other papers. [3]

Following the past year's model, the workshop will have a set of invited talks (Jon Kleinberg and Markus Kroetzsch have already accepted our invitation [4] \o/), a poster session, and more.

Questions and comments are welcome. Otherwise, we're looking forward to receiving your submissions and seeing you in Lyon in April. :)

Best,
Leila, on behalf of the organizers [5]

[1] http://wikiworkshop.org/2018/
[2] https://www2018.thewebconf.org/
[3] http://wikiworkshop.org/2018/#dates
[4] http://wikiworkshop.org/2018/#speakers
[5] http://wikiworkshop.org/2018/#organization

--
Leila Zia
Senior Research Scientist
Wikimedia Foundation

Research Project on Real-Time Vandal Detection during Editing
by Martin Potthast 16 Feb '18

Hi everyone,

we [1] would like to announce a research project with the goal of studying whether user interactions recorded at the time of editing are suitable to predict vandalism in real time. Should vandal editing behavior be sufficiently different from normal editing behavior, this would allow for a number of interesting real-time prevention techniques. For example:

- withholding confidently suspicious edits for review before publishing them,
- a popup asking "I am not a vandal" (as in Google's "I am not a robot") to analyze vandal reactions,
- a popup with a chat box to personally engage vandals, e.g., to help them find other ways of stress relief or to understand them better,
- or at the very least: a new signal to improve traditional vandalism detectors.

We have set up a laboratory environment to study editor behavior in a realistic setting using a private mirror of Wikipedia. No editing whatsoever is conducted on the real Wikipedia as part of our experiments, and all test subjects of our user studies are made aware of the experimental nature of their editing. We plan on making use of crowdsourcing as a means to attain scale and diversity.

If you wish to participate in this study as a test subject yourself, please get in touch. The more diversity, the more insightful the results will be.

We are also happy to collaborate and to answer all questions that may arise in relation to the project. For example, our setup and tooling may turn out to be useful to study other user behavior-related things without having to actually deploy experiments within the live MediaWiki.

Best,
Martin

PS: The AICaptcha project seems most closely related. @Vinitha and Gergő: If you wish, we can set up a Skype meeting to talk about avenues for collaboration.

[1] A group of students and researchers from Bauhaus-Universität Weimar (www.webis.de) and Leipzig University (www.temir.org); project PI: Martin Potthast.

Analytics Hadoop cluster maintenance announce for Feb 6th
by Luca Toscano 05 Feb '18

*TL;DR*: The Analytics Hadoop cluster will be completely down for at most 2h on *Feb 6th* (EU/CET morning) to upgrade all the daemons to Java 8.

Hi everybody,

we are planning to upgrade the Analytics Hadoop cluster to Java 8 on *Feb 6th* (EU/CET morning) for https://phabricator.wikimedia.org/T166248. Sadly we can't do a rolling upgrade of all the JVM-based Hadoop daemons, since the distribution that we use (Cloudera) recommends performing the upgrade only after a complete cluster shutdown. This means that for a couple of hours (hopefully a lot less) all the Hadoop-based services will be unavailable (Hive, Oozie, HDFS, etc.).

We have tested the new configuration in labs and all the regular Analytics jobs seem to work correctly, so we don't expect major issues after the upgrade, but if you have any questions or concerns please follow up in the task.

Thanks!

Luca and Andrew (on behalf of the Analytics team)

San Francisco union rules
by James Salsman 25 Jan '18

> you can't have volunteer resources run the video
> equipment for you. You need a professional crew.

Is there a more specific description of the authority, wording, and legislative intent behind this, please?

OpenSym 2018 | August 22-24, 2018 | Paris, France | General Call for Papers | Deadline March 15, 2018
by Nicolas Jullien 23 Jan '18

Dear all,

it's my pleasure to inform you that the call for papers for OpenSym 2018 is available.

Conference website and call for papers: http://opensym.org
Papers are due by March 15, 23h59 (any time on Earth).
Submission: https://easychair.org/conferences/?conf=opensym2018
(acceptance rate in 2017: 45%)

Topics: The conference provides peer-reviewed research tracks on subjects related to open collaboration including:

- Open Collaboration Research, esp. Wikis and Social Media
- Free/Libre and Open Source Software (FLOSS)
- Open Data, Open Access, and Open Science
- Open Education
- IT-Driven Open Innovation
- Open Policy/Open Government/Open Law
- Wikipedia and Wikimedia Research

Looking forward to seeing your paper's presentation in Paris,
Nicolas Jullien, general chair of OpenSym 2018

About the Conference
--------------------

OpenSym is the only conference that brings together the different strands of open collaboration research and practice, seeking to create synergies and inspire new collaborations between people from computer science, information science, social science, humanities, and everyone interested in understanding open collaboration and how it is changing our society.

This year's conference will be held in Paris, France on August 22-24, 2018. A Doctoral Symposium will take place on August 21, 2018.

OpenSym is held in cooperation with ACM SIGWEB and ACM SIGSOFT, and the conference proceedings will be archived in the ACM Digital Library like all prior editions.

Submission Information and Instructions
---------------------------------------

Topics: The conference provides peer-reviewed research tracks on subjects related to open collaboration including:

- Open Collaboration Research, esp. Wikis and Social Media
- Free/Libre and Open Source Software (FLOSS)
- Open Data, Open Access, and Open Science
- Open Education
- IT-Driven Open Innovation
- Open Policy/Open Government/Open Law
- Wikipedia and Wikimedia Research

Paper Presentation: OpenSym 2018 will be organized as a one-track conference in order to emphasize the interdisciplinary character of this conference and to encourage discussion.

Submission Deadline: The research paper submission deadline is March 15th 2018. Submitted papers should present integrative reviews or original reports of substantive new work: theoretical, empirical, and/or in the design, development and/or deployment of novel concepts, systems, and mechanisms. Research papers will be reviewed to meet rigorous academic standards of publication. Papers will be reviewed for relevance, conceptual quality, innovation and clarity of presentation. All submissions are made via the EasyChair platform, here: https://easychair.org/conferences/?conf=opensym2018

Paper Length: There is no minimum or maximum length for submitted papers. Rather, reviewers will be instructed to weigh the contribution of a paper relative to its length. Papers should report research thoroughly but succinctly: brevity is a virtue. A typical length of a "long research paper" is 10 pages (formerly the maximum length limit and the limit on OpenSym tracks), but a paper may be shorter if the contribution can be described and supported in fewer pages; shorter, more focused papers (called "short research papers" previously) are encouraged and will be reviewed like any other paper. While we will review papers longer than 10 pages, the contribution must warrant the extra length. Reviewers will be instructed to reject papers whose length is incommensurate with the size of their contribution. Papers should be formatted in ACM SIGCHI paper format. Reviewing is not double-blind, so manuscripts do not need to be anonymized.

Posters: As in previous years, OpenSym will also be hosting a poster session at the conference. To propose a poster, authors should submit an extended abstract (not more than 4 pages) describing the content of the poster, which will be published in a non-archival companion proceedings to the conference. Posters should use the ACM SIGCHI templates for extended abstracts. An example of a poster abstract can be found here. Reviewing is not double-blind, so abstracts do not need to be anonymized.

Paper Proceedings: OpenSym is held in cooperation with ACM SIGWEB and ACM SIGSOFT, and the conference proceedings will be archived in the ACM Digital Library like all prior editions. OpenSym seeks to accommodate the needs of the different research disciplines it draws on, including disciplines with archival conference proceedings and disciplines where authors usually present at conferences and publish later. Authors whose submitted papers have been accepted for presentation at the conference have a choice of:

- having their paper become part of the official proceedings, archived in the ACM Digital Library,
- having their paper published on the conference website only, with no transfer of copyright from the authors,
- having no publication record at all but only the presentation at the conference.

Response from authors: For the second time at OpenSym, authors will be given the opportunity to write a response to their reviews before final decisions are made. This should be treated as an opportunity to correct any mistakes or misconceptions in the reviews as well as to propose minor changes that the authors can make during the two weeks between notification and the camera-ready deadline.

Important Dates

- Submission deadline: March 15, 2018
- Reviews sent to authors: May 11, 2018
- Response to reviews from authors due: May 20, 2018
- Final decision notification: June 15, 2018
- Camera-ready papers due: June 22, 2018
- Papers available online: July 13, 2018

Conference Organization

The general chairs of the conference are Nicolas Jullien and Olivier Berger, IMT, France. Feel free to contact us with any questions you might have at info(a)opensym.org.

--
Maître de Conférences (HDR) / Associate Professor. https://nicolasjullien.wp.mines-telecom.fr/
Directeur de M@rsouin http://www.marsouin.org
Membre du LEGO http://labo-lego.fr
Responsable du M2 management innovation parcours Mgt du SI et des données @ischool IMT Atlantique https://innovationmanagement.wp.imt.fr/

Upcoming research newsletter: new papers open for review
by Tilman Bayer 22 Jan '18

Resending the email below, as it does not seem to have made it to the inboxes of several recipients - including myself - even though it is recorded in the list's archives <https://lists.wikimedia.org/pipermail/wiki-research-l/2018-January/date.html>.

---
From: masssly at ymail.com
Subject: [Wiki-research-l] Upcoming research newsletter: new papers open for review
Date: Fri Jan 19 20:12:18 UTC 2018

Hi everyone,

We're preparing for the January 2018 research newsletter [1] and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201801 and add your name next to any paper you are interested in covering. Our target publication date is January 26 UTC. As usual, short notes and one-paragraph reviews are most welcome.

Highlights from this month:

• Can conference papers have information value through Wikipedia? An investigation of four engineering fields
• Collaborative Approach to Developing a Multilingual Ontology: A Case Study of Wikidata
• Determining Quality of Articles in Polish Wikipedia Based on Linguistic Features
• Emo, Love, and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
• Fostering Public Good Contributions with Symbolic Awards: A Large-Scale Natural Field Experiment at Wikipedia
• Knowledge categorization affects popularity and quality of Wikipedia articles
• The Conceptual Correspondence between the Encyclopaedia and Wikipedia
• The Wisdom of Polarized Crowds
• Use of Louisiana's Digital Cultural Heritage by Wikipedians
• What Makes Wikipedia's Volunteer Editors Volunteer?
• Wikipedia-integrated publishing: a comparison of successful models

If you have any questions about the format or process, feel free to get in touch off-list.

Masssly, Tilman Bayer and Dario Taraborelli

[1] http://meta.wikimedia.org/wiki/Research:Newsletter

--
Tilman Bayer
Senior Analyst
Wikimedia Foundation
IRC (Freenode): HaeB

Fwd: [Input requested] Knowledge as a Service at the Wikimedia Developer Summit 2018
by Adam Baso 22 Jan '18

Cross-post.

---------- Forwarded message ----------
From: Adam Baso <abaso(a)wikimedia.org>
Date: Thu, Jan 18, 2018 at 6:38 AM
Subject: [Input requested] Knowledge as a Service at the Wikimedia Developer Summit 2018
To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>

Howdy Wikitechnorati,

(And thank you for your patience with me cross-posting if you're on other lists.)

I'm writing to invite your input on the following Phabricator task ahead of next week's Wikimedia Developer Summit 2018 [1] session.

Knowledge as a Service
https://phabricator.wikimedia.org/T183315

The purpose [2] of the Wikimedia Developer Summit 2018 sessions is to provide guidance for Phase 2 of the Movement Strategic Direction [3] on buildout of technology capabilities. We'd really love your thoughts to help set context for our session next week, as Knowledge as a Service is a primary consideration in the Movement Strategic Direction.

What is Knowledge as a Service? Its essence is about information architecture approaches and the necessary software that will ultimately allow content consumption and creation to radiate to new and different types of interfaces and devices in addition to browser-based approaches. As you review position papers from attendees [4] you'll notice that the way they (myself included) think about best solving this is through a heavy emphasis on technology that makes it easier to better structure information and its metadata for re-use, remixing, and querying.

What might this mean? Does it mean we should build Wikimedia software in an API- and metadata-first manner following industry standards compatible with content structuration? Does it mean weaving our existing structured and semi-structured data technologies together? How do we build technology that can ensure successful collaboration between communities on increasingly structured and interdependent information sources? And how can we ensure the tech will bolster growth of multilingual and multimedia content creation and consumption?

I've copied some of the essential material from the Movement Strategic Direction concerning Knowledge as a Service so you have it here. We would appreciate your input and hope you will subscribe to the Phabricator task to contribute and follow along as we explore this topic.

https://phabricator.wikimedia.org/T183315

The following content is copied from https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction :

Knowledge as a service: To serve our users, we will become a platform that serves open knowledge to the world across interfaces and communities. We will build tools for allies and partners to organize and exchange free knowledge beyond Wikimedia. Our infrastructure will enable us and others to collect and use different forms of free, trusted knowledge.

...

As technology spreads through every aspect of our lives, Wikimedia's infrastructure needs to be able to communicate easily with other connected systems.

...

As a platform, we need to transform our structures to support new formats, new interfaces, and new types of knowledge. We have a strategic opportunity to go further and offer this platform as a service to other institutions, beyond Wikimedia. In a world that is becoming more and more connected, building the infrastructure for knowledge gives others a vested interest in our success. It is how we ensure our place in the larger network of knowledge, and become an essential part of it. As a service to users, we need to build the platform for knowledge or, in jargon, provide knowledge as a service.

...

Knowledge as a service: A platform that serves open knowledge to the world across interfaces and communities

Our openness will ensure that our decisions are fair, that we are accountable to one another, and that we act in the public interest. Our systems will follow the evolution of technology. We will transform our platform to work across digital formats, devices, and interfaces. The distributed structure of our network will help us adapt to local contexts.

...

We will build tools for allies and partners to organize and exchange free knowledge beyond Wikimedia.

We will continue to build the infrastructure for free knowledge for our communities. We will go further by offering it as a service to others in the network of knowledge. We will continue to build the partnerships that enable us to develop knowledge we can't create ourselves.

...

Our infrastructure will enable us and others to collect and use different forms of free, trusted knowledge.

We will build the technical infrastructures that enable us to collect free knowledge in all forms and languages. We will use our position as a leader in the ecosystem of knowledge to advance our ideals of freedom and fairness. We will build the technical structures and the social agreements that enable us to trust the new knowledge we compile. We will focus on highly structured information to facilitate its exchange and reuse in multiple contexts.

Thank you.
-Adam

[1] https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018
[2] https://www.mediawiki.org/wiki/Wikimedia_Developer_Summit/2018/Purpose_and_Results
[3] https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction
[4] https://wikifarm.wmflabs.org/devsummit/index.php/Session:10
