Wiki-research-l January 2019

wiki-research-l@lists.wikimedia.org

28 participants
22 discussions

Wikipedia Research policy
by song＠cs.umn.edu 14 Jul '23

Pursuant to prior discussions about the need for a research policy on Wikipedia, WikiProject Research is drafting a policy regarding the recruitment of Wikipedia users to participate in studies. At this time, we have a proposed policy, and an accompanying group that would facilitate recruitment of subjects in much the same way that the Bot Approvals Group approves bots.

The policy proposal can be found at: http://en.wikipedia.org/wiki/Wikipedia:Research

The Subject Recruitment Approvals Group mentioned in the proposal is being described at: http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group

Before we move forward with seeking approval from the Wikipedia community, we would like additional input about the proposal, and would welcome additional help improving it.

Also, please consider participating in WikiProject Research at: http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research

--
Bryan Song
GroupLens Research
University of Minnesota

First Introduction [Kaushik Reddy]
by K. Kaushik Reddy 02 Feb '19

Hi developers,

This is K. Kaushik Reddy from India. I'm keen on data science and good with Python. I am looking to participate in GSoC'19 with your organisation. I have also gone through the ideas page for GSoC'19, but I am looking for a few data science projects with Python as the code base this year. In the future, I'm planning to make research and development my career. I have always wanted to know how you plan and execute a project in practice.

Secondly, could you please point me to relevant bugs that need to be fixed, so that I can get involved in GSoC'19 along with you? Hoping to learn more and have fun with you all.

Best,
K. Kaushik Reddy.

CfP 5th International Conference on Computational Social Science (IC2S2 2019) at University of Amsterdam, The Netherlands
by Diliara Valeeva 28 Jan '19

IC2S2 2019 – 5th INTERNATIONAL CONFERENCE ON COMPUTATIONAL SOCIAL SCIENCE
July 17-20, 2019
University of Amsterdam, The Netherlands

Web: https://2019.ic2s2.org/
Twitter: @IC2S2 <https://twitter.com/IC2S2> / #IC2S2

Deadline for abstracts: 5 February 2019

This summer the fifth edition of IC2S2 is hosted by the University of Amsterdam, The Netherlands. IC2S2 brings together researchers in computational science, complexity, and social science, and provides a platform for new work in the field of computational social science. Following a day of tutorials on July 17, contributed abstracts are presented orally in parallel thematic sessions or as posters at the three day conference, which takes place at the University of Amsterdam in the Netherlands from July 18 to 20.

================================================
KEYNOTE SPEAKERS
================================================

Kenneth Benoit (London School of Economics and Political Science)
Jana Diesner (University of Illinois at Urbana Champaign)
Deen Freelon (University of North Carolina)
Mirta Galesic (Santa Fe Institute)
Cesar Hidalgo (MIT Media Lab)
Devra Moehler (Facebook Research)
Scott Page (University of Michigan)
Leto Peel (Université Catholique de Louvain)
Emma Spiro (University of Washington)
Lukas Vermeer (Booking.com)
Claudia Wagner (GESIS Leibniz Institute for the Social Sciences)

================================================
SPECIAL SESSION “5 YEARS OF IC2S2"
================================================

After an opening talk by Duncan Watts (Microsoft Research), several renowned computational social scientists from the field will engage in a panel discussion on the future of the field of computational social science.

Among confirmed panelists are:

Ulrik Brandes (ETH Zürich)
Sandra González-Bailón (University of Pennsylvania)
Helen Margetts (University of Oxford, The Alan Turing Institute)
Markus Strohmaier (RWTH Aachen University)
Duncan Watts (Microsoft Research)

===================================================
CALL FOR PAPERS (ABSTRACTS)
===================================================

IC2S2 solicits abstracts from researchers in the social sciences with a clear component of computation, simulation or data analysis or data science. This includes for example sociology, psychology, communication science, anthropology, media studies, political science, public health, and economics. In addition, contributions from computer science, data science, and computational science with real-world applications in the social sciences or related fields, are welcome. We emphatically welcome abstracts that try to integrate both components. This is not limited to empirical studies; more general theoretical contributions are also welcome.

Topics of interest include, but are not limited to, the following:

* Network analysis of social systems
* Large-scale social experiments
* Agent-based or other simulations of social phenomena
* Text analysis and natural language processing (NLP) of social phenomena
* Cultural patterns and dynamics
* Computational science studies (sociology of science)
* Social news curation and collaborative filtering
* Social media studies
* Theoretical discussions in computational social science
* Causal inference and computational methods for social science
* Ethics in computational social sciences
* Reproducibility in computational social science
* Large scale infrastructure in computational social science
* Novel digital data sources
* Computational analyses for addressing societal challenges
* Methods and analyses of observational social data
* Computational social science research in industry

Contributions to the conference should be submitted via EasyChair at: https://easychair.org/conferences/?conf=ic2s2-2019

Please follow the extended abstract template guidelines for Word (ic2s2-word-template.docx <https://2019.ic2s2.org/wp-content/uploads/2018/12/ic2s2-word-template.docx>) and LaTeX (ic2s2-latex-template.zip <https://2019.ic2s2.org/wp-content/uploads/2018/12/ic2s2-latex-template.zip>) for formatting instructions. Note that abstracts should be submitted as a PDF file no larger than 20MB. Submissions that exceed the 2-page limit (including figures and references) will be automatically rejected. The extended abstract should include a title and a list of 5 keywords, but no authors’ names or affiliations. The abstract should outline the main contribution, data and methods used, results, and the impact of the work. Authors are encouraged to include one figure in their submission (the figure counts towards the page limit).

Please do not include authors’ names and affiliations in the submitted document, as peer review will be double blind. Each extended abstract will be reviewed by multiple members of the Program Committee, composed of experts in computational social science. When submitting on EasyChair you will be asked to provide information about the authors and their affiliations and to include a one-sentence summary of the extended abstract (20-50 words). The summary will be used for assigning reviewers. You can indicate a preference for an oral presentation or a poster presentation, but your preference may not be honored in the final decision.

Submissions will be non-archival, and the presented work can be already published, in preparation for publication elsewhere, or ongoing research. Submission implies willingness to present a talk or poster at the conference. The deadline for abstract submission is February 5, 2019.

====================================================
IMPORTANT DATES
====================================================

* February 5, 2019 – Regular abstract submission deadline
* March 6, 2019 – Acceptance notification
* April 30, 2019 – Early bird registration deadline
* June 30, 2019 – Registration deadline
* July 17-20, 2019 – Conference

=====================================================
SPONSORSHIP
=====================================================

Do you want to affiliate your organisation to IC2S2? Bespoke sponsorship packages are available on request. Contact the local chairs Eelke Heemskerk and Frank Takes at ic2s2(a)uva.nl

IC2S2 2019 is hosted by CORPNET <https://corpnet.uva.nl/>, University of Amsterdam and so far it is sponsored by Microsoft Research, Statistics Netherlands, ODISSEI, University of Amsterdam, Leiden University, Municipality of Amsterdam, Amsterdam Centre for Inequality Studies and SAGE Publishing.

Please watch the video “IC2S2 comes to Amsterdam <https://www.youtube.com/watch?v=TUWvAysyqPk&t=2s>”.

Web: https://2019.ic2s2.org/
Twitter: @IC2S2 <https://twitter.com/IC2S2> / #IC2S2

¿Model to automatically classify if one user is bot or not?
by ABEL SERRANO JUSTE 25 Jan '19

Hello fellow wiki investigators!

I have observed that, very often in wikis, users who are not in the bot groups are actually behaving like bots. Since MediaWiki doesn't prevent normal users from automating tasks through its API, you might have a "normal" user who is actually doing bot things. I would like to identify those users and treat them as bots.

Is anyone aware of an already implemented model that classifies whether a user is a bot or not?

Thanks and nice weekend!

--
Saludos, Abel.
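
For illustration only, the idea of flagging bot-like accounts can be sketched as a simple heuristic over edit timestamps. This is not an existing model and not an answer from the thread: the function names, the features (mean gap between edits and how regular those gaps are) and all thresholds are assumptions chosen for the example. The timestamps are assumed to be ISO 8601 UTC strings of the kind the MediaWiki API returns for `list=usercontribs` with `ucprop=timestamp`.

```python
from datetime import datetime
from statistics import mean, pstdev


def parse_ts(ts):
    """Parse an ISO 8601 UTC timestamp such as 2019-01-25T12:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def activity_features(timestamps):
    """Summarise how fast and how regular a user's edits are."""
    times = sorted(parse_ts(t) for t in timestamps)
    # Seconds between consecutive edits.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    features = {"edits": len(times), "mean_gap": None, "gap_cv": None}
    if gaps:
        avg = mean(gaps)
        features["mean_gap"] = avg
        # Coefficient of variation: low values mean very regular gaps.
        features["gap_cv"] = pstdev(gaps) / avg if avg > 0 else 0.0
    return features


def looks_automated(features, min_edits=20, max_mean_gap=10.0, max_gap_cv=0.5):
    """Hypothetical rule: many edits, made quickly, at regular intervals.

    The thresholds are illustrative guesses, not validated values.
    """
    if features["edits"] < min_edits or features["mean_gap"] is None:
        return False
    return (features["mean_gap"] <= max_mean_gap
            and features["gap_cv"] <= max_gap_cv)
```

A rule like this would only be a baseline. Features of this kind could instead be fed to a trained classifier, using accounts in the bot group as positive examples, but that would need labelled data and evaluation that this sketch does not provide.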

research on productivity increases through anonymization
by James Salsman 25 Jan '19

"Anonymous calling: The WikiScanner scandals and anonymity on the Japanese Wikipedia" from yesterday's call for reviews was very good; I like this quote: > “If there is a user ID attached to a user, discussion tends to become a criticizing game. On the other hand, under an anonymous system, even though your opinion/information is criticized, you don’t know with whom to be upset. Also, with a user ID, those who participate in the site for a long time end to have authority, and it becomes difficult for a user to disagree with them. Under a perfectly anonymous system, you can say ‘it’s boring’, if it is actually boring. All information is treated equally; only an accurate argument will work.” [Andrew Lih. The Wikipedia revolution: How a bunch of nobodies created the world’s greatest encyclopedia. New York: Hyperion, 2009, p. 145.] Most of the research on the ways anonymity helps productivity went on during the design of double-blind experiments, and I hope more of that is done in the context of Wikipedia. I am too busy this month to do any reviews, and will be disappointed if someone else doesn't: https://firstmonday.org/ojs/index.php/fm/article/view/9184/7694 Thanks! Best regards, Jim

Why the world reads Wikipedia: beyond English
by Leila Zia 24 Jan '19

Hi all,

As some of you know, we started a line of research back in 2016 to understand Wikipedia readers better. We published the first taxonomy of Wikipedia readers, and we studied and characterized the reader types in English Wikipedia [1]. Over the past year or more, we focused on learning about the potential differences of Wikipedia readers across languages, based on the taxonomy built in [1]. We've learned a lot, and today we're sharing what we learned with you.

Some pointers:

* Publication: https://arxiv.org/abs/1812.00474
* Data: https://figshare.com/articles/Why_the_World_Reads_Wikipedia/7579937/1
* (under continuous improvement) Research page on meta: https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Be…
* Research showcase presentation: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#December_2018
* A series of presentations to WMF teams and community: Look for tasks under https://phabricator.wikimedia.org/T201699 with the title "Present the results of WtWRW" for links to slides and more info when available.
* We will hopefully send out a blog post about it soon. A blog post about the intermediate results is at https://wikimediafoundation.org/2018/03/15/why-the-world-reads-wikipedia/

In a nutshell:

* We ran the taxonomy of Wikipedia readers in 14 languages, measured the prevalence of Wikipedia use cases, and characterized Wikipedia readers in these languages.
* While we observe similarities in the prevalence of the use cases as well as in the way we can characterize readers, we can see that Wikipedia languages lend themselves to different distributions of readership and characteristics. In many cases, one-size-fits-all solutions may simply not work for readers.
* Intrinsic learning remains the number one motivation for people to come to Wikipedia in the majority of the languages, followed by media.
* In-depth reading and the reading of science-oriented topics are strongly and negatively correlated with the socio-economic status and Human Development Index of the countries the readers in these languages are coming from. Long articles that may seem just too long for the bulk of our audience in the US, Japan, and the Netherlands are in high demand in India, Bolivia, Argentina, Panamá, México, …
* ...

This research would not have been possible without the extensive contributions of our formal collaborators: Florian Lemmerich (RWTH Aachen University) and Bob West (EPFL). On the WMF end, I was fortunate to work with Diego Saez on this project as well as, more recently, Isaac Johnson, and with all those in the Reading Web and Legal teams who supported us throughout the process.

I also want to underline the amazing work that the volunteers in the languages in the study did to help us learn more about their languages. They helped not only with communications within their communities but also with the translation task, which was not an easy one: they were asked to offer their time not only to translate but also to meet with us in person, so that we could make sure the intent of each question is translated the same way across the languages. Usernames Strainu, Tgr, Amire80, Awossink, Antanana, Lyzzy, Shangkuanlc, Whym, Kaganer, عباد_ديرانية, Satdeep_Gill, Racso, Hasive: Thank you!

Next we are going to extend this study to include demographics information. More information about it will come out in the next few weeks. (And I will send out a separate email to wikimedia-l about this topic and future research over the weekend. I need some time to finalize the message to make it most useful for that audience. :)

Best,
Leila

[1] https://arxiv.org/abs/1702.05379

--
Leila Zia
Senior Research Scientist, Lead
Wikimedia Foundation

Upcoming Research Newsletter: New Papers Open For Review
by Mohammed Sadat Abdulai 24 Jan '19

Hi everyone,

We’re preparing for the January 2019 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201901 and add your name next to any paper you are interested in covering. Our target publication date is January 31 UTC, although actual publication might happen several days later. As usual, short notes and one-paragraph reviews are most welcome.

Highlights from this month:

• "Anonymous calling": The WikiScanner scandals and anonymity on the Japanese Wikipedia
• A Historical Perspective on Information Systems: A Tool and Methodology for Studying the Evolution of Social Representations on Wikipedia
• An inferior source? Quantitatively analyzing the production and revision of five technology-enhanced learning-related terms on Wikipedia
• Asthma Information Seeking via Wikipedia Between 2015 and 2018: Implications for Awareness Promotion
• Can deep learning techniques improve classification performance of vandalism detection in Wikipedia?
• Feature Analysis for Assessing the Quality of Wikipedia Articles through Supervised Classification
• Following the Fukushima Disaster on (and against) Wikipedia: A Methodological Note about STS Research and Online Platforms
• Hacking History: Redressing Gender Inequities on Wikipedia Through an Editathon
• Improving New Editor Retention on Wikipedia
• Let’s Talk About Refugees: Network Effects Drive Contributor Attention to Wikipedia Articles About Migration-Related Topics
• Pharmacy students can improve access to quality medicines information by editing Wikipedia articles
• The digital knowledge economy index: mapping content production
• Towards Compiling Textbooks from Wikipedia
• Trolls, bans and reverts: simulating Wikipedia
• Why the World Reads Wikipedia: Beyond English Speakers
• Wizard of Wikipedia: Knowledge-Powered Conversational agents

Masssly, Tilman Bayer and Dario Taraborelli

[1] http://meta.wikimedia.org/wiki/Research:Newsletter

Re: [Wiki-research-l] Looking for examples of use of Wikipedia and Wikidata in AI
by Lucie-Aimée Kaffee 23 Jan '19

Hello John,

In the domain of natural language generation, there has been some work around generating text from Wikidata triples (mostly for Wikipedia):

In under-resourced languages, domain-independent: https://link.springer.com/chapter/10.1007/978-3-319-93417-4_21
In under-resourced languages, domain-independent: http://aclweb.org/anthology/N18-2101
In English, for the biography domain (Wikidata and DBpedia): https://www.sciencedirect.com/science/article/pii/S1570826818300313
In English, for the biography domain: https://arxiv.org/abs/1702.06235

Since we worked in this field, please reach out if you have questions regarding those publications.

Best,
Lucie

On Wed, 23 Jan 2019 at 13:58, john cummings <mrjohncummings(a)gmail.com> wrote:
> Hi all
>
> I'm putting together an overview of Wikimedia for a conference submission
> and their theme is AI. Does anyone have any examples of use of Wikimedia by
> AI projects?
>
> Thanks
>
> John
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l(a)lists.wikimedia.org
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.wi…

--
Lucie-Aimée Kaffee
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton

Looking for examples of use of Wikipedia and Wikidata in AI
by john cummings 23 Jan '19

Hi all I'm putting together an overview of Wikimedia for a conference submission and their theme is AI. Does anyone have any examples of use of Wikimedia by AI projects? Thanks John

Gender gap statistics
by fn＠imm.dtu.dk 23 Jan '19

(sorry for cross-posting on wiki-research and wikidata)

For an event, I am trying to find statistics about the gender gap. At one point there was a nice website with some information. Now it seems to be gone. https://denelezh.dicare.org/gender-gap.php redirects to https://wdcm.wmflabs.org/WDCM_BiasesDashboard/ but I get "502 Bad Gateway".

For http://whgi.wmflabs.org/gender-by-date-of-birth.html I see change statistics on a web page. (When I view this page it seems as if the CSS is missing.)

I have tried the Wikidata Query Service. I have a few results here: https://www.wikidata.org/wiki/User:Fnielsen/Gender

The bad news is that more complex SPARQL queries time out, e.g. persons across Wikipedias and genders; I can only do it on Wikisource and Wikiquote. For instance, I find the ratio of female biographies on the Danish Wikipedia to be 17.2% compared to the total number of biographies. It would be interesting to know how this number compares with other Wikipedias. If I include Wikipedia as a parameter in the SPARQL query, it times out. (Writing a script could possibly solve it.)

For WHGI, I see some CSV files. For instance, https://figshare.com/articles/Wikidata_Human_Gender_Indicators/3100903 reports 9447 and 51774 for women and men, respectively. My SPARQL query does not give these values...

Do we have updated statistics on contributor gender, particularly for Danish Wikipedia? I know we have some papers on gender and Wikipedia, see https://tools.wmflabs.org/scholia/topic/Q17002416

"Gender Markers in Wikipedia Usernames" displays 4.6% or 1.2% for females, while "The Wikipedia Gender Gap Revisited: Characterizing Survey Response Bias with Propensity Score Estimation" estimates up to 23%.

The old UNU survey https://web.archive.org/web/20110728182835/http://www.wikipediastudy.org/do… seems not to do gender statistics per Wikipedia.

https://stats.wikimedia.org seems not to have any gender statistics.

--
Finn Årup Nielsen
http://people.compute.dtu.dk/faan/
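
The script mentioned above could look roughly like the following minimal sketch, which sends one small query per Wikipedia instead of one large query across all of them. It is not the query used for the 17.2% figure: the query shape, the helper names and the User-Agent string are assumptions made for this example, and the share is computed only over persons that have a gender statement. The identifiers are standard Wikidata ones (P31 "instance of", Q5 "human", P21 "sex or gender", Q6581072 "female"). A single-wiki query may still time out for the largest Wikipedias.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"
FEMALE = "Q6581072"  # Wikidata item for "female"

# One query per language edition keeps each request small.
QUERY_TEMPLATE = """
SELECT ?gender (COUNT(DISTINCT ?person) AS ?count) WHERE {{
  ?article schema:about ?person ;
           schema:isPartOf <https://{lang}.wikipedia.org/> .
  ?person wdt:P31 wd:Q5 ;
          wdt:P21 ?gender .
}}
GROUP BY ?gender
"""


def build_query(lang):
    """Return the SPARQL query for one Wikipedia language code, e.g. 'da'."""
    return QUERY_TEMPLATE.format(lang=lang)


def run_query(query):
    """Send the query to the Wikidata Query Service and return parsed JSON."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={
        "Accept": "application/sparql-results+json",
        # Hypothetical identifier; replace with your own contact details.
        "User-Agent": "gender-gap-sketch/0.1 (example script)",
    })
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


def counts_by_gender(result):
    """Map gender item IDs (e.g. 'Q6581072') to biography counts."""
    counts = {}
    for row in result["results"]["bindings"]:
        qid = row["gender"]["value"].rsplit("/", 1)[-1]
        counts[qid] = int(row["count"]["value"])
    return counts


def female_share(counts):
    """Share of female biographies among those with a gender statement."""
    total = sum(counts.values())
    return counts.get(FEMALE, 0) / total if total else None
```

Looping over language codes and calling `female_share(counts_by_gender(run_query(build_query(lang))))` for each would then give one comparable number per Wikipedia, ideally with a pause between requests to stay within the service's usage limits.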
