Hi,
My advisor, Prof. Mitra, is traveling this week. He said he will post his
thoughts to this thread later next week.
Also, one thing he wanted me to mention here is the following:
Although the content of the articles was generated by an algorithm, a
human (I) took those articles and posted them online. We randomly chose a
few articles and checked whether any objectionable content had been
collected from the web. We planned to remove it before posting on
Wikipedia. We did not create a bot that went and created the articles
randomly. We generated the content offline and then copy-pasted the content
of randomly selected articles. While we decided to remove objectionable
content, we did not make any other changes to sentences, because
that would void checking for linguistic consistency -- which was our sole
purpose. Also, it was done in 'good faith', and hence we worked on a bare
minimum of articles to get an idea, rather than letting a bot create random
junk. Our algorithm is not capable of judging whether the cited references
(found when we search on Google) are reliable or not, but we thought that
reviewers on Wikipedia would remove content from such links, as well as
the references themselves, if they are unreliable. While some references
were removed for such reasons (e.g. https://en.wikipedia.org/wiki/Atripliceae),
some articles were removed as promotional content (which our algorithm
also cannot really determine).
Thanks for the comments here, we will keep them in mind if we do anything
similar to this in the future, and I will try to inform other researchers
who work in this area.
Thanks,
Sidd
Thanks Kerry. In part 2, when I mentioned that the waiver should be stamped,
I meant applying to the IRB and then getting a stamp. Sorry for not being
clear.
Thanks,
Sidd
As I mentioned earlier, this is not the first work on article
generation. These are among the first works we know of:
https://people.csail.mit.edu/csauper/pubs/sauper-sm-thesis.pdf
https://people.csail.mit.edu/regina/my_papers/wiki.pdf
None of these mentioned anything about human subjects, since in the end no
personal information is used (about the person who is deleting, etc.). Nor
did any reviewers or attendees at the conferences in this area raise
this aspect.
Also,
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-01-28/Recen…
is relevant here as it talks about our previous work.
If a "record of someone doing something" is relevant from a human-subjects
point of view, any data on Wikipedia can be used to find the editors (if
not the real person). For example:
https://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/viewFile/3505/3968
https://alchemy.cs.washington.edu/papers/wu08/wu08.pdf
I have met several researchers who work with such data (revisions from
Wikipedia), and nothing about IRB ever came up.
Nevertheless, as I said, if there are concrete rules, I think it would help
the research community as a whole to know what can or cannot be done and
also ask for permissions.
I appreciate the suggestions that Stuart mentioned in a previous email about
experimenting on would-be-deleted articles or articles lacking sources. But
as of now we are not planning anything, and if we do, we will certainly get
in touch with Denny (who had a video chat with me before starting this
thread) and try to find the best way of doing it.
I have asked my PhD advisor (the other author on the paper) to check this
thread, and he will be able to give more input, as I am not very qualified
to comment on these aspects.
Thanks,
Sidd
iConference 2017 | Effect • Expand • Evolve: Global Collaboration Across the Information Community
Wuhan, China
March 22-25, 2017
Conference website: http://ischools.org/the-iconference/
Conference submissions site: https://www.conftool.com/iConference2017/
We are now accepting submissions for iConference 2017, our twelfth annual gathering of scholars, researchers and professionals who share an interest in the critical information issues of contemporary information society.
iConference 2017 takes place March 22-25, 2017, in Wuhan, China. The theme of this first-ever Asia-based iConference is “Effect • Expand • Evolve: Global Collaboration Across the Information Community.”
Authors and organizers can now submit materials using our secure submissions website: https://www.conftool.com/iConference2017/. The official proceedings will be published in the open access Illinois Digital Environment for Access to Learning and Scholarship (IDEALS). The submissions deadline is September 16, 2016.
iConference 2017 is jointly hosted by the Wuhan University School of Information Management and Korea’s Sungkyunkwan University Library & Information Science and Data Science Department. The 3,500-year-old city of Wuhan represents a combination of ancient culture and modern living, and conference participants are assured a memorable and rewarding experience.
As always, the iConference will include peer-reviewed papers, posters, workshops and sessions for interaction and engagement, interspersed with multiple opportunities for networking. Early career and next generation researchers can engage in the Doctoral Student Colloquium and Early Career Colloquium.
New this year are special conference programs focused on iSchool Best Practices, and also on iSchools and Industry Partnership. In addition, there will be a special track for papers originating in China.
The iConference brings together scholars and researchers addressing critical information issues in contemporary society. The iConference pushes the boundaries of information studies, explores core concepts and ideas, and creates new technological and conceptual configurations—all shaping interdisciplinary discourses. Affiliation with a member iSchool is not required—all information scholars, researchers, and practitioners are encouraged to make submissions. Visit our website for more information, including sample topics and links to past proceedings: http://ischools.org/the-iconference/
The iConference is presented by the iSchools organization (www.ischools.org), a worldwide association of information schools dedicated to advancing the information field, and preparing students to meet the information challenges of the 21st Century. The event is sponsored by Microsoft Research, and other sponsorships are available.
IMPORTANT LINKS
* Conference: http://ischools.org/the-iconference/
* Submissions: https://www.conftool.com/iConference2017/
* Past Proceedings: http://ischools.org/the-iconference/about-the-iconference/
* Facebook: IConference: https://www.facebook.com/IConference
* Twitter: @iConf | #iconf17
SUBMISSION INFORMATION
All submissions must be in English using our official template. All work should be original and not previously published. Complete guidelines can be found on the Author Instructions page: http://ischools.org/the-iconference/program/author-instructions/
All submissions are due by September 16, 2016.
* PAPERS
We invite papers falling into two categories: completed research or early work/preliminary results. Each paper will be refereed in a double-blind process. The author(s) of the completed research paper judged the best of the iConference will receive the Lee Dirks Award for Best Paper and $5,000.
http://ischools.org/the-iconference/program/papers/
Submission deadline: September 16, 2016
Papers Chairs: Jian Qin, Syracuse University; Mika Grundstrom, University of Tampere; Jevin West, University of Washington
* POSTERS
We welcome submission of posters presenting new work, preliminary results and designs, or educational projects. Posters will undergo a double-blind review, and accepted abstracts will be published in the proceedings.
http://ischools.org/the-iconference/program/posters/
Submission deadline: September 16, 2016
Posters Chairs: Ann-Sofie Axelsson, University of Boras; Min Song, Yonsei University; Esfandiar Haghverdi, Indiana University
* WORKSHOPS
This year’s workshops will be half-day only, and are intended to foster interactive discussions focusing on the particular topic within the purview of the iSchools, namely, the relationships among information, people and technology. Workshops provide a great opportunity for attendees who share common interests and want to have intensive discussions.
http://ischools.org/the-iconference/program/workshops/
Submission deadline: September 16, 2016
Workshops Chairs: Noa Aharony, Bar-Ilan University; Daqing He, University of Pittsburgh; Fred Ye, Nanjing University
* SESSIONS FOR INTERACTION AND ENGAGEMENT (SIE)
These sessions provide an excellent opportunity to present ideas, facilitate discussions, and foster knowledge-sharing in unconventional ways. Formats can include panels, fishbowls, performances, storytelling, roundtable discussions, wildcard sessions, demos/exhibitions, and more. All should be highly participatory, informal, engaging, and pluralistic.
http://ischools.org/the-iconference/program/sessions-for-interaction-and-en…
Submission deadline: September 16, 2016
SIE Chairs: Ryan Shaw, University of North Carolina; Andrew Cox, University of Sheffield; Jin Ha Lee, University of Washington
* DOCTORAL COLLOQUIUM
The Doctoral Colloquium provides doctoral students the opportunity to present their work to senior faculty and engage with one another in a setting that is relatively informal but that allows for the fullest of intellectual exchanges. Students receive feedback on their dissertation, career paths, and other areas from participating faculty and student peers.
http://ischools.org/the-iconference/program/doctoral-colloquium/
Application deadline: September 16, 2016
Doctoral Colloquium Chairs: Kevin Crowston, Syracuse University; Bella Jing Zhang, Sun Yat-sen University; Pia Borlund, University of Copenhagen; Jiangping Chen, University of North Texas
* DOCTORAL DISSERTATION AWARD
Recognizing the outstanding dissertation of the preceding year, this competition is open to all member iSchools. Each school may submit one dissertation for consideration. The winner will receive a cash prize of $2,500, the runner up $1,000; both will be honored at the iConference.
http://ischools.org/the-iconference/program/dissertation-award/
Submission deadline: September 16, 2016
Dissertation Award Chairs: Michael Seadle, Humboldt University of Berlin; Soo Young Rieh, University of Michigan; Masooda Bashir, University of Illinois
* EARLY CAREER COLLOQUIUM
This half-day event is intended for assistant professors, post-docs, or others in pre-tenure positions and builds on the tradition of highly successful events at past iConferences. Participants will sign up at registration.
http://ischools.org/the-iconference/program/early-career-colloquium/
Early Career Colloquium Chairs: Marcia Zeng, Kent State University; Kathy Burnett, Florida State University; Antonio Soares, University of Porto
THE FOLLOWING ARE NEW THIS YEAR
* SPECIAL TRACK: CHINESE PAPERS
This special track for 2017 aims to provide excellent opportunities for scientists, scholars and participants throughout China to present their latest research results and to exchange views or experience not only in Library and Information Science, but also in other related areas. The theme is “Data Management and Services.”
http://ischools.org/the-iconference/program/chinese-papers/
Submission Deadline: September 16, 2016
Chinese Papers Chairs: Ruhua Huang, Wuhan University; Wei Lu, Wuhan University
* SPECIAL PROGRAM: iSCHOOL BEST PRACTICES
This special session will focus on issues pertaining to curriculum/teaching/student experience. iSchools are invited to submit proposals and case studies to be considered for inclusion in this special program.
http://ischools.org/the-iconference/program/ischool-best-practices/
iSchool Best Practices Chairs: Gary Marchionini, University of North Carolina; Shigeo Sugimoto, University of Tsukuba; Ann-Sofie Axelsson, University of Boras
* SPECIAL PROGRAM: iSCHOOLS AND INDUSTRY PARTNERSHIP
New this year, the iConference will present a number of sessions focused on successful industry partnerships of iSchools worldwide. iSchools are invited to submit proposals in the form of case studies to be considered for inclusion in this special program.
http://ischools.org/the-iconference/program/ischools-and-industry-partnersh…
iSchools and Industry Partnership Chairs: Sean McGann, University of Washington; Gobinda Chowdhury, Northumbria University; Joon Lee, Seoul National University
ADDITIONAL ORGANIZERS
iConference 2017 Chairs: Qing Fang, Wuhan University; Sam Oh, Sungkyunkwan University
Program Chairs: Wonsik Jeff Shim, Sungkyunkwan University; Chuanfu Chen, Wuhan University
Local Arrangements Chairs: Lihong Zhou, Wuhan University; Fei Wang, Wuhan University
Proceedings Chairs: Wayne de Fremery, Sogang University; Preben Hansen, Stockholm University
Program Committee:
Alessandro Acquisti, Carnegie Mellon University
Denice Adkins, University of Missouri
Naresh Agarwal, Simmons
Jim Andrews, University of South Florida
Ann-Sofie Axelsson, University of Boras
Tim Baldwin, University of Melbourne
Dania Bilal, University of Tennessee, Knoxville
Leanne Bowler, University of Pittsburgh
Josep Cobarsi Morales, Open University of Catalonia
Constantinos Coursaris, Michigan State University
Gianluca Demartini, University of Sheffield
Stephen Downie, University of Illinois
Karen Fisher, University of Washington
Paul Gandel, Syracuse University
Lisa Given, Charles Sturt University
Sara Grimes, University of Toronto
Lorna Hughes, University of Glasgow
Jaap Kamps, University of Amsterdam
Robert Kauffman, Singapore Management University
Diane Kelly, University of North Carolina-Chapel Hill
Kyung-Sun “Sunny” Kim, University of Wisconsin-Madison
Young Seek Kim, University of Kentucky
Lars Konzack, University of Copenhagen
Andy Koronios, University of South Australia
Serap Kurbanoğlu, Hacettepe University
Philippe Lenca, Télécom Bretagne
Claire R. McInerney, Rutgers University
Julie McLeod, University of Northumbria at Newcastle
Ruth Nalumaga, Makerere University
Sanghee Oh, Florida State University
Virginia Ortíz-Repiso Jiménez, Universidad Carlos III de Madrid
Jung-ran Park, Drexel University
Vivien Petras, Humboldt-Universität zu Berlin
Nils Pharo, University College Oslo and Akershus
Peter Reid, Robert Gordon University Aberdeen
David Reitter, Penn State University
Howard Rosenbaum, Indiana University
Reijo Savolainen, University of Tampere
Kalpana Shankar, University College Dublin
Elizabeth Shepherd, University College London
Antonio Lucas Soares, University of Porto
Martin Souček, Charles University in Prague
Eduardo Vendrell Vidal, Polytechnic University of Valencia
Iris Xie, University of Wisconsin – Milwaukee
Yong Jeong Yi, Sungkyunkwan University
Yan Zhang, University of Texas Austin
Lihong Zhou, Wuhan University
More at: http://ischools.org/the-iconference/
Kevin Crowston | Distinguished Professor of Information Science | School of Information Studies
Syracuse University
348 Hinds Hall
Syracuse, New York 13244
t (315) 443.1676 f 315.443.5806 e crowston(a)syr.edu <mailto:crowston@syr.edu>
crowston.syr.edu <http://crowston.syr.edu/>
As I mentioned earlier, I was not sure about the multiple account policy. I
got the notification about the incident being raised, and I will be happy
with whatever decisions Wiki administrators make.
As Denny mentioned, we did not plan anything large-scale, only a
small group of edits. Furthermore, we stated that the results were only
valid as of a particular date before the submission of that conference
paper, and things may have already changed a lot since (articles removed,
edited further, etc.). We have not made any additions since Feb, nor do we
plan to do anything further. Whatever we do will be offline.
To Denny's point about other researchers trying to do the same kind of
research, I do see research in this area coming up, and it might make sense
to have certain rules (although I do not know much about how rules
work on Wikipedia in general). I know this because some researchers
have contacted me previously about this work, and they are also looking into
similar areas. One example in this area of work is the following -- this is
very recent:
http://snap.stanford.edu/wikiworkshop2016/papers/wikiworkshop_icwsm2016_poc…
Regarding human subjects, neither the conference reviewers nor anyone
from Wikimedia mentioned anything about that earlier. Our
previous works were featured earlier in Wikimedia newsletters (links in
earlier emails), and still nothing was mentioned about it, nor did we find
any information on Wikipedia in general about it. As per the requirements,
approval would be necessary if the research obtains: data about living
individuals through intervention or interaction, or identifiable private
information about living individuals. As mentioned, the "about" is very
important, because no data about editors was used or collected in the
research.
I will keep following the thread; if the rules do change, please also let me
know -- I will try to inform all researchers who work in this area if
they get in touch with me.
-- Siddhartha
Hello Everyone,
I am the first author of the paper that Denny has referred to. Firstly, I
want to thank Denny for asking me to join this list and learn more about this
discussion.
1. Regarding quality, we know that there are issues, and even in the
conference, I have repeatedly told the audience that I am not satisfied
with the quality of the content generated. However, the percentage of
articles that were removed when the paper was submitted was minimal. I
have sent Denny a list of accounts that were used, and it is possible that
several of the articles created from those accounts have been removed
within the last couple of months. I was not aware of the multiple
account policy.
2. The area of Wikipedia article generation has been explored by others in
the past. [http://www.aclweb.org/anthology/P09-1024,
http://wwwconference.org/proceedings/www2011/companion/p161.pdf] We were
not aware of any rules regarding this sort of experiment. However, we do
understand that such experiments can harm the general quality of this great
encyclopedic resource, hence we did our analysis on a bare minimum of
articles. In fact, we did our initial work on it back in 2014, and Wikimedia
research even covered details about our paper here --
https://blog.wikimedia.org/2015/02/02/wikimedia-research-newsletter-january…
If questions had been raised at that point, we would surely not have done
anything further on this, or would rather have done things offline without
creating or adding any content on Wikipedia.
I understand your point about imposing rules and I think it makes sense.
However, during this research, we were not aware of any rules, and hence
continued our work.
As I have told Denny, our purpose was to check whether we could create
bare-minimum articles that could eventually be improved by authors on
Wikipedia, and also to see whether they would be removed entirely. But it
was done with a few articles, and we did not create anything beyond that
point. Also, we did not make any manual modifications to the articles,
although we saw quality issues, because that would void our analysis and
claims.
Thanks everyone for your time and the great work you are doing for the
Wikipedia community.
Regards,
Sidd
Hi all,
I found a paper at IJCAI 2016, which left me quite curious:
https://siddbanpsu.github.io/publications/ijcai16-banerjee.pdf
In short, they find red links, classify them, find the closest similar
articles, use the section titles from these articles to decide on sections,
search for content for the sections, paraphrase it, and write complete
Wikipedia articles.
Then they uploaded the articles to Wikipedia, and of the 50 uploaded
articles, only 3 got deleted. The rest stayed. I was rather excited when I
heard that - were the articles really that good?
Then I took a look at the articles and... well, judge for yourself. The
paper only mentions three articles of the 47 survivors:
https://en.wikipedia.org/wiki/Dick_Barbour
https://en.wikipedia.org/wiki/Atripliceae (here is the last version as
created by the bot before significant human clean-up:
https://en.wikipedia.org/w/index.php?title=Atripliceae&oldid=697456858 )
https://en.wikipedia.org/wiki/Talonid
I have connected with the first author and he promised to give me a list of
all articles as soon as he can get it, which will be in a few weeks because
he is away from his university computer right now. He was able to produce
one more article though:
https://en.wikipedia.org/wiki/Sonia_Bianchetti_Garbato
(Also, see history for the extent of human clean-up)
I am not writing to talk badly about the authors or about the reviewing
practice at IJCAI, or about the state of research in that area. Also, I
really do not want to discourage research in this area.
I have a few questions, though:
1) the fact that so many of these articles have survived for half a year
indicates that there are some problems with our review processes. Does
someone want to investigate why these articles survived in the
given state?
2) as far as I know we don't have rules for this kind of experiment, but
maybe we should. In particular, I feel that BLPs should not be created by
an experimental approach like this one. Should we set up rules for this
kind of experiment?
3) Wikipedia contributors are participating in these experiments without
consent. I find that worrisome, and would like to hear what others think.
I have invited the first author to join this list.
I understand the motivation: by exposing from the beginning that these
articles were created by bots, they would have been scrutinized differently
than articles written by humans. Therefore they remained quiet about the
fact (but are willing to reveal it now that the experiment is over -
they also explicitly don't have any intentions of expanding the scope of
the experiment at the given point of time).
Cheers,
Denny
SEMANTiCS 2016 - The Linked Data Conference
Workshops, Tutorials and the DBpedia Day
12th International Conference on Semantic Systems
Leipzig, Germany
September 12-15, 2016
_http://2016.semantics.cc/_
*Workshops/Tutorials *
This year's SEMANTiCS is starting on September 12th with a full day of
exciting and interesting satellite events. In _6 parallel tracks_
<http://2016.semantics.cc/satellite-events> scientific and industrial
workshops and tutorials are scheduled to provide a forum for groups of
researchers and practitioners to discuss and learn about hot topics in
Semantic Web research.
Attending the SEMANTiCS workshops and tutorials is _free of charge_, but
you need to register. Feel free to have a closer look and register for
the events here: _http://2016.semantics.cc/satellite-events_.
*DBpedia Day - Call for Participation *
Following our successful meetings in Europe & US, our next DBpedia
meeting will be held in Leipzig on September 15th, co-located with
SEMANTiCS.
_Highlights_
- Keynote #1: Wikidata: bringing structured data to Wikipedia with 16000
volunteers by Lydia Pintscher, product manager of Wikidata
- Keynote #2: Harald Sack, (title TBA) (Hasso-Plattner-Institut)
- A session for the “_DBpedia references and citations challenge_
<http://wiki.dbpedia.org/ideas/idea/261/dbpedia-citations-reference-challeng…>”
- A _session on DBpedia ontology_
<http://mappings.dbpedia.org/index.php/DBpedia_Ontology_Committee> by
members of the DBpedia ontology committee
- Tell us what cool things you do with DBpedia: https://goo.gl/AieceU
- As always, there will be tutorials to learn about DBpedia and a
DBpedia showcase session
_Quick facts_
- Web URL: _http://wiki.dbpedia.org/meetings/Leipzig2016_
- When: September 15th, 2016
- Where: University of Leipzig, Augustusplatz 10, 04109 Leipzig
- Call for Contribution: _https://goo.gl/AieceU_ (submission form)
- Registration: Free to participate, but only through registration
(option for DBpedia support tickets):
https://event.gg/3396-7th-dbpedia-community-meeting-in-leipzig-2016
We are looking forward to your contributions and to seeing you at the
SEMANTiCS in Leipzig!
-------------------------------------------------------------------------------
WSDM Cup 2017: Call for Participation
-------------------------------------------------------------------------------
We invite you to take part in one of the following shared tasks:
Task 1.
Vandalism Detection -- Given a Wikidata revision, is it damaging?
This task is about detecting vandalism as well as all other kinds of damaging
edits to Wikidata. In doing so, not only Wikidata's integrity is protected, but
also that of all information systems making use of the knowledge base.
Task 2.
Triple Scoring -- Compute relevance scores for triples from type-like relations.
For example, the triple "Johnny_Depp profession Actor" should get a high score,
because acting is Depp's main profession, whereas "Quentin_Tarantino profession
Actor" should get a low score, because Tarantino is more of a director than an
actor. Such scores are a basic ingredient for ranking results in entity search.
Learn more at http://www.wsdm-cup-2017.org
Register now at https://goo.gl/forms/JaVQwFFewLtVFCik2
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
now open Registration
Sep 1, 2016 Training data release
Dec 8, 2016 Final software submission
Dec 22, 2016 Announcement of evaluation results
Jan 5, 2017 Paper submission
Feb 6-10, 2017 Conference and WSDM Cup workshop
All deadlines are 11:59 PM, anywhere on earth (AoE).
-------------------------------------------------------------------------------
Special Announcements
-------------------------------------------------------------------------------
Evaluation as a Service.
For the sake of reproducibility, we ask you to submit your software instead of
just its run output. Software submissions allow for preserving your software
in working condition, and for re-evaluating it as new datasets appear.
To facilitate software submissions, we will make use of the cloud-based
evaluation platform TIRA (www.tira.io).
Open Source Proceedings.
We encourage the open source release of your software. To maximize the impact
of your software, we collect it at a central repository on GitHub:
https://github.com/wsdm-cup-2017
Private repositories can be assigned to you at request during the competition.
Benefits for early birds.
Submitting your software or your notebook early, as well as registering early
for the conference, will be rewarded. Check out the specific benefits on
our web page at http://www.wsdm-cup-2017.org