Wiki-research-l August 2016

wiki-research-l@lists.wikimedia.org

38 participants
36 discussions

Re: [Wiki-research-l] Research on automatically created articles
by siddhartha banerjee 11 Aug '16


Hello Everyone,

I am the first author of the paper that Denny referred to. First, I want to thank Denny for inviting me to join this list and learn more about this discussion.

1. Regarding quality, we know that there are issues, and even at the conference I repeatedly told the audience that I am not satisfied with the quality of the generated content. However, the percentage of articles that were not removed when the paper was submitted was minimal. I have sent Denny a list of the accounts that were used, and it is possible that several articles created from those accounts have been removed within the last couple of months. I was not aware of the multiple account policy.

2. The area of Wikipedia article generation has been explored by others in the past. [http://www.aclweb.org/anthology/P09-1024, http://wwwconference.org/proceedings/www2011/companion/p161.pdf] We were not aware of any rules regarding these sorts of experiments. However, we do understand that such experiments can harm the general quality of this great encyclopedic resource, hence we did our analysis on bare-minimum articles. In fact, we did our initial work on it back in 2014, and Wikimedia research even covered details about our paper here -- https://blog.wikimedia.org/2015/02/02/wikimedia-research-newsletter-january… If questions had been raised at that point, we would surely not have done anything further on this, or would rather have done things offline without creating or adding any content on Wikipedia.

I understand your point about imposing rules and I think it makes sense. However, during this research, we were not aware of any rules, hence we continued our work. As I have told Denny, our purpose was to check whether we could create bare-minimum articles that could eventually be improved by authors on Wikipedia, and also to see whether they would be removed entirely. But it was done with a few articles, and we did not create anything beyond that point.

Also, we did not make any manual modifications to the articles, although we saw quality issues, because that would have voided our analysis and claims.

Thanks everyone for your time and the great work you are doing for the Wikipedia community.

Regards,
Sidd


The Wikimedia Research Newsletter 6(7) is out
by Mohammed Sadat 10 Aug '16


The July 2016 issue of the Wikimedia Research Newsletter is out:

https://blog.wikimedia.org/2016/08/10/research-newsletter-july-2016/
https://meta.wikimedia.org/wiki/Research:Newsletter/2016/July

In this issue:

1 Making it easier to navigate within article networks via better wikilinks
2 Briefly
2.1 New research journal about Wikipedia and higher education
2.2 Conferences and events

*** 19 recent publications were covered or listed in this issue ***

Thanks to Jonathan Morgan for contributing.

Masssly, Tilman Bayer and Dario Taraborelli

---
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/

* Follow us on Twitter: @WikiResearch
* Receive this newsletter by mail: https://lists.wikimedia.org/mailman/listinfo/research-newsletter
* Subscribe to the RSS feed: http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed/


Research on automatically created articles
by Denny Vrandečić 09 Aug '16


Hi all,

I found a paper at IJCAI 2016 that left me quite curious: https://siddbanpsu.github.io/publications/ijcai16-banerjee.pdf

In short, they find red links, classify them, find the most similar existing articles, use the section titles from these articles to decide on sections, search for content for the sections, paraphrase it, and write complete Wikipedia articles. Then they uploaded the articles to Wikipedia, and of the 50 uploaded articles, only 3 got deleted. The rest stayed.

I was rather excited when I heard that. Were the articles really that good? Then I took a look at the articles and... well, judge for yourself. The paper mentions only three of the 47 surviving articles:

https://en.wikipedia.org/wiki/Dick_Barbour
https://en.wikipedia.org/wiki/Atripliceae (here is the last version as created by the bot before significant human clean-up: https://en.wikipedia.org/w/index.php?title=Atripliceae&oldid=697456858 )
https://en.wikipedia.org/wiki/Talonid

I have contacted the first author, and he promised to give me a list of all articles as soon as he can get it, which will be in a few weeks because he is away from his university computer right now. He was able to produce one more article, though:

https://en.wikipedia.org/wiki/Sonia_Bianchetti_Garbato (also, see the history for the extent of human clean-up)

I am not writing to speak badly of the authors, of the reviewing practice at IJCAI, or of the state of research in that area. Also, I really do not want to discourage research in this area. I have a few questions, though:

1) The fact that so many of these articles have survived for half a year indicates that there are some problems with our review processes. Does someone want to investigate why these articles survived in the given state?

2) As far as I know, we don't have rules for these kinds of experiments, but maybe we should. In particular, I feel that BLPs should not be created by an experimental approach like this one. Should we set up rules for these kinds of experiments?

3) Wikipedia contributors are participating in these experiments without consent. I find that worrisome, and would like to hear what others think.

I have invited the first author to join this list.

I understand the motivation: had it been disclosed from the beginning that these articles were created by bots, they would have been scrutinized differently than articles written by humans. Therefore the authors remained quiet about the fact (but they are willing to reveal it now that the experiment is over, and they explicitly have no intention of expanding the scope of the experiment at this point in time).

Cheers,
Denny


Contribute to the SEMANTiCS Workshops and the DBpedia Day
by Sebastian Hellmann 09 Aug '16


SEMANTiCS 2016 - The Linked Data Conference
Workshops, Tutorials and the DBpedia Day
12th International Conference on Semantic Systems
Leipzig, Germany
September 12-15, 2016
http://2016.semantics.cc/

*Workshops/Tutorials*

This year's SEMANTiCS starts on September 12th with a full day of satellite events. In 6 parallel tracks <http://2016.semantics.cc/satellite-events>, scientific and industrial workshops and tutorials are scheduled to provide a forum for groups of researchers and practitioners to discuss and learn about hot topics in Semantic Web research. Attending the SEMANTiCS workshops and tutorials is free of charge, but you need to register. Feel free to have a closer look and register for the events here: http://2016.semantics.cc/satellite-events.

*DBpedia Day - Call for Participation*

Following our successful meetings in Europe and the US, our next DBpedia meeting will be held in Leipzig on September 15th, co-located with SEMANTiCS.

Highlights

- Keynote #1: Wikidata: bringing structured data to Wikipedia with 16000 volunteers, by Lydia Pintscher, product manager of Wikidata
- Keynote #2: Harald Sack (Hasso-Plattner-Institut), title TBA
- A session for the "DBpedia references and citations challenge" <http://wiki.dbpedia.org/ideas/idea/261/dbpedia-citations-reference-challeng…>
- A session on the DBpedia ontology <http://mappings.dbpedia.org/index.php/DBpedia_Ontology_Committee> by members of the DBpedia ontology committee
- Tell us what cool things you do with DBpedia: https://goo.gl/AieceU
- As always, there will be tutorials to learn about DBpedia and a DBpedia showcase session

Quick facts

- Web URL: http://wiki.dbpedia.org/meetings/Leipzig2016
- When: September 15th, 2016
- Where: University of Leipzig, Augustusplatz 10, 04109 Leipzig
- Call for Contribution: https://goo.gl/AieceU (submission form)
- Registration: Free to participate, but only through registration (option for DBpedia support tickets): https://event.gg/3396-7th-dbpedia-community-meeting-in-leipzig-2016

We are looking forward to your contributions and to seeing you at SEMANTiCS in Leipzig!


[CfP] WSDM Cup 2017: Knowledge Base Quality and Search
by Martin Potthast 04 Aug '16


-------------------------------------------------------------------------------
WSDM Cup 2017: Call for Participation
-------------------------------------------------------------------------------

We invite you to take part in one of the following shared tasks:

Task 1. Vandalism Detection -- Given a Wikidata revision, is it damaging?
This task is about detecting vandalism as well as all other kinds of damaging edits to Wikidata. In doing so, not only is Wikidata's integrity protected, but also that of all information systems making use of the knowledge base.

Task 2. Triple Scoring -- Compute relevance scores for triples from type-like relations.
For example, the triple "Johnny_Depp profession Actor" should get a high score, because acting is Depp's main profession, whereas "Quentin_Tarantino profession Actor" should get a low score, because Tarantino is more of a director than an actor. Such scores are a basic ingredient for ranking results in entity search.

Learn more at http://www.wsdm-cup-2017.org
Register now at https://goo.gl/forms/JaVQwFFewLtVFCik2

-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------

now open          Registration
Sep 1, 2016       Training data release
Dec 8, 2016       Final software submission
Dec 22, 2016      Announcement of evaluation results
Jan 5, 2017       Paper submission
Feb 6-10, 2017    Conference and WSDM Cup workshop

All deadlines are 11:59 PM, anywhere on earth (AoE).

-------------------------------------------------------------------------------
Special Announcements
-------------------------------------------------------------------------------

Evaluation as a Service. For the sake of reproducibility, we ask you to submit your software instead of just its run output. Software submissions allow for preserving your software in working condition, and for re-evaluating it as new datasets appear. To facilitate software submissions, we will make use of the cloud-based evaluation platform TIRA (www.tira.io).

Open Source Proceedings. We encourage the open source release of your software. To maximize the impact of your software, we collect it at a central repository on GitHub: https://github.com/wsdm-cup-2017 Private repositories can be assigned to you on request during the competition.

Benefits for early birds. Submitting your software or your notebook early, as well as registering early for the conference, will be rewarded. Check out the specific benefits on our web page at http://www.wsdm-cup-2017.org


Alternatives to google forms for surveys
by Jan Dittrich 03 Aug '16


Hello Research,

I would like to set up surveys at Wikimedia Deutschland. Currently we have Google Forms as a possibility. We wonder whether there are any other solutions, ideally open source and/or self-hostable. What I found was https://www.limesurvey.org/de/ and I wonder whether anybody has experience with it.

Jan

--
Jan Dittrich
UX Design / User Research
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
http://wikimedia.de

Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment.

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Registered in the register of associations of the Berlin-Charlottenburg district court under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin (tax office for corporations), tax number 27/029/42207.

