Hi Everyone,
The next Research Showcase will be live-streamed this Wednesday, October
19, 2016 at 11:30 AM (PDT) / 18:30 (UTC).
Link for remote presenters to join the Hangout on Air:
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#October_2016>.
YouTube stream: https://www.youtube.com/watch?v=cBImUZ_si5s
This month's showcase includes:
Human centered design for using and editing structured data in Wikipedia
infoboxes
By *Charlie Kritschmar
<https://www.mediawiki.org/wiki/User:Charlie_Kritschmar_(WMDE)>, UX
Intern, Wikimedia Deutschland
<https://meta.wikimedia.org/wiki/Wikimedia_Deutschland>*
Wikidata is a
Wikimedia project which stores structured data to be used by other
Wikimedia projects like Wikipedia. Currently, integrating its data in
Wikipedia is difficult for users, since there’s no predefined way to do so
and doing so requires some technical knowledge. To tackle these issues,
human-centered design methods were applied to identify needs, from which
solutions were generated and evaluated with the help of the community. The
resulting concept may serve as a basis for implementation in various wiki
projects in the future, making it more user-friendly to edit Wikidata from
within another Wikimedia project and improving the project’s acceptance
in the community.
Emergent Work in Wikipedia
By *Ofer Arazy <http://oferarazy.com/> (University of Haifa)*
Online production communities
present an exciting opportunity for investigating novel organizational
forms. Extant theoretical accounts of knowledge co-production point to
organizational policies, norms, and communication as key mechanisms
enabling the coordination of work. Yet, in practice participants in
initiatives such as Wikipedia are often occasional contributors who are
unaware of community policies and do not communicate with other members.
How then is work coordinated and how does the organization maintain
stability in the face of dynamics in individuals’ task enactment? In this
study we develop a conceptualization of emergent roles - the prototypical
activity patterns that organically emerge from individuals’ spontaneous
actions - and investigate the temporal dynamics of emergent role behaviors.
Conducting a multi-level large-scale empirical study stretching over a
decade, we tracked co-production of a thousand Wikipedia articles, logging
two hundred thousand distinct participants and seven hundred thousand
co-production activities. Using a combination of manual tagging and machine
learning, we annotated each activity type, and then clustered participants’
activity profiles to arrive at seven prototypical emergent roles. Our
analysis shows that participants’ behavior is turbulent, with substantial
flow in and out of co-production work and across roles. Our findings at the
organizational level, however, show that work is organized around a highly
stable set of emergent roles, despite the absence of traditional
stabilizing mechanisms such as pre-defined work procedures or role
expectations. We conceptualize this dualism in emergent work as “Turbulent
Stability”. Further analyses suggest that co-production is
artifact-centric, where contributors mutually adjust according to the
artifact’s changing needs. Our study advances the theoretical
understandings of self-organizing knowledge co-production and particularly
the nature of emergent roles.
Hope to see you there!
Sarah R. Rodlund
Senior Project Coordinator-Engineering, Wikimedia Foundation
srodlund(a)wikimedia.org
Hi,
Does Wikipedia store any metadata, logs or anything else useful for tracking
one's activity?
For instance, visited web pages, grants, additional user info etc.
(anything beyond the known text edits and talk pages).
We are trying to capture all available pieces of information regarding one's
behavior in Wikipedia.
Any ideas?
Thanks,
Alex
Hi everybody,
We’re preparing for the September 2016 research newsletter and looking for contributors. Please take a look at: https://etherpad.wikimedia.org/p/WRN201609 and add your name next to any paper you are interested in covering. The publication schedule is a bit mixed up currently - there is a chance we will already need to get out this issue in the next few days; but if you prefer to take more time, feel free to mark your contribution for the subsequent October issue instead, which should come out toward the end of this month. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
• 5000 people on Brexit & US Elections
• A Smooth Transition to Modern mathoid-based Math Rendering in Wikipedia with Automatic Visual Regression Testing
• Answering End-User Questions, Queries and Searches on Wikipedia and its History
• Automated News Suggestions for Populating Wikipedia Entity Page
• Content Disputes in Wikipedia Reflect Geopolitical Instability
• Creating Causal Embeddings for Question Answering with Minimal Supervision
• Cultural Differences in the Understanding of History on Wikipedia
• Examining potential mechanisms underlying the Wikipedia gender gap through a collaborative editing task
• Expanding Wikidata's Parenthood Information by 178%, or How To Mine Relation Cardinalities
• Exploration on the Use of WDQS: Breakdown by Geography, User Agent and Referer Class
• Finding News Citations For Wikipedia
• Gender gap on Wikipedia: visible in all categories?
• How do students trust Wikipedia? An examination across genders
• Incorporating Relation Paths in Neural Relation Extraction
• Memory Remains: Understanding Collective Memory in the Digital Age
• Once You Step Over the First Line, You Become Sensitized to the Next: Towards a Gateway Theory of Online Participation
• Privacy, Anonymity, and Perceived Risk in Open Collaboration: A Study of Tor Users and Wikipedians
• Quality and Importance of Wikipedia Articles in Different Languages
• Using Semantic Web Technologies for Explaining and Predicting Abnormal Expenses
• Veni, Vidi, Vicipaedia: Using the Latin Wikipedia in an Advanced Latin Classroom
• WikInfoboxer: A Tool to Create Wikipedia Infoboxes Using Dbpedia
• Wikipedia and participatory culture: Why fans edit
• Writing for Wikipedia in the classroom: challenging official knowledge (a case study in 12th grade)
If you have any questions about the format or process, feel free to get in touch off-list.
Masssly, Tilman Bayer and Dario Taraborelli
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
Dear Wikipedia and MediaWiki people,
Here are some suggested ideas that may allow Wikipedia and MediaWiki to be
organized better in the future, and that may help in organizing the world's
open public information.
*Summaries - popup summaries for pages*
By automatically generating content for the title attribute of the <a>
tag, taken either from an <article><header><section id="summary"> element
or from a designated section of the page's wiki markup, a popup summary
could be generated so that readers can quickly look up the definition of a
term on a hyperlink. This would vastly aid the user experience.
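The popup idea above can be sketched in a few lines of Python. This is only an illustration, assuming the target page's HTML already carries a <section id="summary"> element as proposed; SummaryExtractor and link_with_summary are hypothetical names, not part of MediaWiki.

```python
from html import escape
from html.parser import HTMLParser


class SummaryExtractor(HTMLParser):
    """Collect the text inside the first <section id="summary"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth of <section> tags within the summary
        self.done = False   # set once the summary section has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "section" or self.done:
            return
        if self.depth > 0:
            self.depth += 1                      # nested section inside the summary
        elif dict(attrs).get("id") == "summary":
            self.depth = 1                       # entering the summary section

    def handle_endtag(self, tag):
        if tag == "section" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def summary(self):
        # Join the collected text and collapse runs of whitespace.
        return " ".join("".join(self.parts).split())


def link_with_summary(href, label, page_html):
    """Build an <a> tag whose title attribute holds the target page's summary."""
    parser = SummaryExtractor()
    parser.feed(page_html)
    parser.close()
    title = escape(parser.summary(), quote=True)
    return '<a href="{}" title="{}">{}</a>'.format(
        escape(href, quote=True), title, escape(label))


# Hypothetical page content, for illustration only.
page = ('<article><header><section id="summary">A <b>wiki</b> is a '
        'collaboratively edited website.</section></header></article>')
print(link_with_summary("/wiki/Wiki", "wiki", page))
```

Browsers show the title attribute as a plain tooltip, so the sketch strips inline markup and escapes quotes before embedding the text.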
*Categories - breadcrumb-like hierarchical and cross-referencing
categorization and navigation*
By creating a set of categorical navigation pages, all fields of
knowledge on Wikipedia could be categorized.
By displaying a clickable list of hierarchical categories, like
'breadcrumb' navigation lists, under the page title, the user could quickly
navigate this hierarchy.
By adding popup menus to the separating chevrons, each listing the elements
of that subcategory, cross-category navigation would be made possible.
Double-clicking on a chevron should navigate to the categorical navigation
page.
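A breadcrumb trail of this kind could be rendered roughly as follows. This is a minimal Python sketch, assuming the category path for a page is already known and that category pages live under /wiki/Category:<name>, as on English Wikipedia; the breadcrumb function is hypothetical, and the chevron popup menus are left out.

```python
from html import escape
from urllib.parse import quote


def breadcrumb(path, separator=" &gt; "):
    """Render a category path, broadest category first, as a chain of links."""
    links = []
    for name in path:
        # Wiki URLs conventionally use underscores in place of spaces.
        target = quote(name.replace(" ", "_"))
        links.append('<a href="/wiki/Category:{}">{}</a>'.format(
            target, escape(name)))
    return separator.join(links)


# Hypothetical category path, for illustration only.
print(breadcrumb(["Science", "Computer science", "Algorithms"]))
```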
*QuickLink - Quick Link Creation*
A hotkey and a JavaScript script could allow the creation of links from a
selected, highlighted bit of normal text: look up the term, display its
summary, and let the user confirm the generation of a new hyperlink very
quickly, without having to edit markup.
*Move towards semantic content*
By using new HTML elements like <article>, <section> and <header>, together
with ids and classes, more of a semantic mapping of content may be
established. This may be done incrementally, for example by a bot
auto-generating new summary information that may be verified by either users
or editors before publishing.
*API*
APIs for summaries, categories, and semantic content should be made
available.
More to come ...
Regards,
Aaron Gray
Hey folks,
I just finished working with Amir[1,2] and building off of some of Morten's
work[3] to put together something that I think you're going to like.
> Halfaker, Aaron (2016): Monthly Wikipedia article quality predictions.
> figshare.
> https://dx.doi.org/10.6084/m9.figshare.3859800
> Retrieved: 00:56, Oct 12, 2016 (GMT)
This dataset contains a row for every article-month since 20010101. Each
row has an article quality prediction based on a text-only machine classifier
(from [3], with slight improvements) hosted by ORES[4]. We've managed to
build models for English, French, and Russian Wikipedia, so I've generated
datasets for each of those wikis. It's current as of 2016-08-01 and I plan
to run updates periodically.
Here are the columns:
- page_id -- The page identifier
- page_title -- The title of the article (UTF-8_with_underscores)
- rev_id -- The most recent revision ID at the time of assessment
- timestamp -- The timestamp when the assessment was taken
(YYYYMMDDHHMMSS)
- prediction -- The predicted quality class ("Stub", "Start", "C", "B",
"GA", "FA", ...)
- weighted_sum -- The sum of prediction weights assuming indexed class
ordering ("Stub" = 0, "Start" = 1, ...)
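To make the timestamp and weighted_sum columns concrete, here is a minimal Python sketch. It assumes that the "prediction weights" are per-class probabilities and that the class ordering is exactly the six classes named above (other wikis may use different classes); parse_timestamp, weighted_sum and the example probabilities are hypothetical and not taken from the dataset.

```python
from datetime import datetime

# Assumed ordering, taken from the classes listed above; index = weight.
CLASS_ORDER = ["Stub", "Start", "C", "B", "GA", "FA"]


def parse_timestamp(value):
    """Parse a YYYYMMDDHHMMSS timestamp string into a datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M%S")


def weighted_sum(probabilities):
    """Sum of class probabilities weighted by class index (Stub = 0, Start = 1, ...)."""
    return sum(index * probabilities.get(name, 0.0)
               for index, name in enumerate(CLASS_ORDER))


# Made-up probabilities for one article-month, for illustration only.
probs = {"Stub": 0.1, "Start": 0.2, "C": 0.4, "B": 0.2, "GA": 0.05, "FA": 0.05}
print(parse_timestamp("20160801000000"))
print(round(weighted_sum(probs), 2))
```

Under these assumptions, a weighted_sum near 0 means the model is confident the article is a stub, and a value near 5 means it is confident the article is FA-class.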
I'll update the docs based on your questions :)
1. https://phabricator.wikimedia.org/p/Ladsgroup/
2. https://github.com/Ladsgroup
3. http://www-users.cs.umn.edu/~morten/publications/wikisym2013-tellmemore.pdf
4. https://ores.wikimedia.org/
-Aaron
Hi there,
I would like this to be a place to discuss the grant proposed here:
https://meta.wikimedia.org/wiki/Grants:Project/Arc.heolo.gy
The goal of the project is to provide a powerful semantic library to
analyze and visualize relationships that might otherwise be hidden within
the immense amount of knowledge contained in Wikipedia.
Any feedback or critiques based on any aspect of the project including tech
stack, methodology, community engagement practices, or goals, would be
HUGELY appreciated.
We are also looking for volunteers who are interested in devops, graph
technology, NLP (word2vec or otherwise), or data visualization!
Thank you for your time,
Ian Seyer
*Wikimedia Documents - PDF Scraping and creating living working documents*
Wikimedia should have a way of creating documents that can easily be
imported and exported to PDF, Word and other formats.
- Document versioning as well as history should be provided.
- Authorship and control of authorship are also needed.
- Fixed and editable status should also be provided.
- Document licensing should also be a feature; this should allow management
of information from different sources.
Document licensing should allow copying and collation to be done in a
controlled way, allowing information dissemination.
PDF documents that are put into the public domain, or under an open license
that permits modification, could be scraped and, where permission is given,
subsumed and rendered into MediaWiki pages in a modifiable state. Those that
are in the public domain or under a fixed-content license may still be
rendered into MediaWiki pages.
Quick access to the original document and modification history should be
mandatory at the top of every original document page. Auto-generated
quotations and citations can also be placed at the bottom of the pages.
More to come ...
Regards,
Aaron Gray
Apologies for cross-postings
********************************
2017 International Conference on Social Media & Society (#SMSociety)
WHEN: July 28-30, 2017
WHERE: Toronto, Canada (Ted Rogers School of Management, Ryerson University)
SUBMISSION DEADLINES:
Dec 5, 2016: Workshops, Tutorials, & Panels
Jan 16, 2017: Full & WIP Papers
Mar 6, 2017: Poster Abstracts
Conference website: <http://SocialMediaAndSociety.org>
2017 #SMSociety Theme: Social Media for Social Good or Evil
CALL FOR PROPOSALS
Our online behaviour is far from virtual--it extends our offline lives. Much
social media research has identified the positive opportunities of using
social media; for example, how people use social media to form support
groups online, participate in political uprising, raise money for charities,
extend teaching and learning outside the classroom, etc. However, mirroring
offline experiences, we have also seen social media being used to spread
propaganda and misinformation, recruit terrorists, live stream criminal
activities, reinforce echo chambers by politicians, and perpetuate hate and
oppression (such as racist, sexist, homophobic, and anti-Semitic behaviour).
Furthermore, behind the posts are algorithms, power structures, commercial
interests and other factors that surreptitiously influence our experiences
on social media. So, we ask:
* What does it actually mean to use social media for social good?
* How can social media be further leveraged for social justice? What
are the threats to meaningful participation and how can we overcome these
threats?
* What do we know about the 4 W's (who, what, why, where), and the how,
of people engaging in anti-social behaviour online?
* What theoretical and methodological tools can we use to study
anti-social behaviour? Can we detect such behaviour automatically?
* What are the ethics of algorithms (inclusion, accessibility, data
discrimination, bots)?
* What are the legal, policy, privacy, and ethical implications of
using social big data?
* Considering the proliferation of bots online, can we still trust
social media data?
* And more broadly, what are the major effects of using social media
on political, economic, individual, and social aspects of our society?
The 2017 International Conference on Social Media & Society (#SMSociety)
invites scholarly and original submissions that relate to the broad theme of
Social Media & Society. We welcome both quantitative and qualitative work
which crosses interdisciplinary boundaries and expands our understanding of
the current and future trends in social media research, especially those
that explore some of the questions and issues raised above.
ABOUT THE CONFERENCE:
The International Conference on Social Media & Society (#SMSociety) is an
annual gathering of leading social media researchers from around the world.
Now in its 8th year, the 2017 conference will be held in Toronto, Canada at
Ted Rogers School of Management, Ryerson University on July 28-30.
From its inception, the Conference has focused on the best practices for
studying the impact and implications of social media on society. Our invited
industry and academic keynotes have highlighted the shifting questions and
concerns for the social media research community: from introducing media
multiplexity and networked individualism with Caroline Haythornthwaite and
Barry Wellman in 2010 and 2011, to measuring influence with Gilad Lotan and
Sharad Goel in 2012 and 2013, to defining social media research as a field
with Keith Hampton in 2014, to identifying our commitments as social media
researchers in policy making with Bill Dutton in 2015, to exploring the
future of social media technologies with John Weigelt in 2015, to
highlighting the challenges of social media data mining in the context of
big data with Susan Halford and Helen Kennedy in 2016.
Organized by the Social Media Lab <http://socialmedialab.ca/> at the
Ted Rogers School of Management <http://www.ryerson.ca/tedrogersschool/> at
Ryerson University, the conference provides participants with opportunities
to exchange ideas, present original research, learn about recent and ongoing
studies, and network with peers. The conference's intensive three-day
program features workshops, full papers, work-in-progress papers, panels,
and posters. The wide-ranging topics in social media showcase research from
scholars working in many fields including Communication, Computer Science,
Education, Journalism, Information Science, Management, Political Science,
Sociology, Social Work, etc.
SUBMISSION DETAILS:
See online at https://socialmediaandsociety.org/submit/
PUBLISHING OPPORTUNITIES:
Full and WIP (short) papers presented at the Conference will be published in
the conference proceedings by the ACM International Conference Proceeding
Series (ICPS)
<http://dl.acm.org/citation.cfm?id=2930971&CFID=847001369&CFTOKEN=15617273&preflayout=flat#prox>
and will be available in the ACM Digital Library. All conference presenters
will be invited to submit their work as a full paper to the special issue of
the Social Media + Society journal <http://sms.sagepub.com/> (published by
SAGE).
TOPICS OF INTEREST:
Social Media Impact on Society
* Political Mobilization & Engagement
* Extremism & Terrorism
* Politics of Hate and Oppression
* The Sharing/Attention Economy
* Social Media & Health
* Virality & Memes
* Social Media & Social Justice
* Social Media & Business (Marketing, PR, HR, Risk Management, etc.)
* Social Media & Academia (Alternative Metrics, Learning Analytics, etc.)
* Social Media & Public Administration
* Social Media & the News
Online/Offline Communities
* Trust & Credibility in Social Media
* Online Community Detection
* Influential User Detection
* Identity
Social Media & Small Data
* Case Studies of Online Communities Formed on Social Media
* Case Studies of Offline Communities that Rely on Social Media
* Sampling Issues
* Value of Small Data
Social Media & Big Data
* Visualization of Social Media Data
* Social Media Data Mining
* Scalability Issues & Social Media Data
* Social Media Analytics
* Ethics of Big Data/Algorithms
Theories & Methods
* Qualitative & Quantitative Approaches
* Opinion Mining & Sentiment Analysis
* Social Network Analysis
* Theoretical Models for Studying, Analysing and Understanding Social Media
Social Media & Mobile
* App-ification of Society
* Privacy & Security Issues in the Mobile World
* Apps for the Social Good
* Networking Apps
ORGANIZING COMMITTEE:
Anatoliy Gruzd, Ryerson University, Canada - Conference Chair
Jenna Jacobson, University of Toronto, Canada - Conference Chair
Philip Mai, Ryerson University, Canada - Conference Chair
K. Hazel Kwon, Arizona State University, USA - Poster Chair
ADVISORY BOARD:
William H. Dutton, Michigan State University, USA
Zizi Papacharissi, University of Illinois at Chicago, USA
Barry Wellman, INSNA Founder, The Netlab Network