I’m currently writing my bachelor thesis at the University of Koblenz, Germany. The goal is to improve Wikipedia search by exploiting the text structure of Wikipedia articles. To conduct unbiased user studies I need real-world queries so I can compare the novel algorithms against the currently used ones. Are there any existing query logs that I can use for this purpose?
Thanks for your help!
WikiConference North America 2016
7-10 October 2016, San Diego, CA, USA
WikiConference North America (formerly WikiConference USA) is the third
annual conference on the North American continent devoted to Wikipedia and
other Wikimedia projects. The weekend will feature both academic and casual
presentations on Wikimedia-related outreach activities, workshops to
improve the skills of grassroots organizers, and discussions on the past,
present, and future of the Wikimedia projects. The conference features
offerings about community outreach, online activity, partnerships with
institutions of knowledge, and technology. Keynote speakers are scheduled
to include Katherine Maher, Executive Director of the Wikimedia Foundation,
and Merrilee Proffitt, Senior Program Officer of OCLC Research. The last
day of the conference will feature programming coinciding with Indigenous
Registration for the conference is now open. You can register at
Scholarships partially covering costs of travel and attendance are
available for active contributors to Wikimedia projects. Apply by August
23rd for scholarships at https://wikiconference.org/wiki/2016/Scholarships.
This is a volunteer run conference and volunteers are needed for any number
of tasks. If you are attending, please consider volunteering for at
We seek presentations addressing topics related to Wikipedia or open access
and culture. Presentations may be from any discipline regarding any
relevant topic. Please submit a description of your proposed presentation
using our online submission process at
https://wikiconference.org/wiki/Submissions. If you are interested in
participating in the peer-reviewed academic track, see our call for
academic submissions at
- Sydney Poore (User:FloNight) and Rosie Stephenson-Goodknight
(User:Rosiestep), conference organizers
My advisor, Prof. Mitra, is travelling this week. He said he will post
his thoughts to this thread later next week.
Also, one thing he wanted me to mention here is the following:
Although the content in the articles was generated by an algorithm, a
human (I) took those articles and posted them online. We randomly chose
a few articles and checked whether any objectionable content had been
collected from the web. We planned to remove any such content before
posting on Wikipedia. We did not create a bot that went and created the
articles randomly. We generated the content offline and then copy-pasted
the content of randomly selected articles. While we decided to remove
objectionable content, we did not make any other changes to the sentences,
because that would void checking for linguistic consistency, which was our
sole purpose. Also, it was done in 'good faith', and hence we worked on the
bare minimum of articles to get an idea, rather than letting a bot create
random junk. Our algorithm does not have the capability of judging whether
the cited references (which we find by searching on Google) are reliable or
not, but we thought that reviewers on Wikipedia would remove content from
such links, as well as the references themselves, if they were unreliable.
While some references were removed for such reasons
(e.g. https://en.wikipedia.org/wiki/Atripliceae), some articles were removed
as promotional content (which our algorithm also cannot really determine).
Thanks for the comments here, we will keep them in mind if we do anything
similar to this in the future, and I will try to inform other researchers
who work in this area.
As I have mentioned earlier, this is not the first work on article
generation. This is one of the first works we know of:
None of these mentioned anything about human subjects, as ultimately no
personal information is used (about the person who is deleting, etc.). Nor
did any reviewers/attendees in the conferences in this area question on
is relevant here as it talks about our previous work.
If a "record of someone doing something" is relevant from a human subjects
point of view, any data on Wikipedia can be used to find the editors (if
not the real person). For example:
I have met several researchers who work with data (revisions from
Wikipedia), and nothing about IRB ever came up.
Nevertheless, as I said, if there are concrete rules, I think it would help
the research community as a whole to know what can or cannot be done and
also ask for permissions.
I appreciate the suggestions that Stuart mentioned in a previous email about
experimenting on articles that would be deleted or articles lacking sources.
But, as of now, we are not planning anything, and if we do, we would for sure
get in touch with Denny (who had a video chat with me before starting this
thread) and would try to learn the best ways of doing it.
I have asked my PhD advisor (the other author on the paper) to check this
thread, and he will be able to give more input, as I am not very qualified
to comment on these aspects.
iConference 2017 | Effect • Expand • Evolve: Global Collaboration Across the Information Community
March 22-25, 2017
Conference website: http://ischools.org/the-iconference/
Conference submissions site: https://www.conftool.com/iConference2017/
We are now accepting submissions for iConference 2017, our twelfth annual gathering of scholars, researchers and professionals who share an interest in the critical information issues of contemporary information society.
iConference 2017 takes place March 22-25, 2017, in Wuhan, China. The theme of this first-ever Asia-based iConference is “Effect • Expand • Evolve: Global Collaboration Across the Information Community.”
Authors and organizers can now submit materials using our secure submissions website: https://www.conftool.com/iConference2017/. The official proceedings will be published in the open access Illinois Digital Environment for Access to Learning and Scholarship (IDEALS). The submissions deadline is September 16, 2016.
iConference 2017 is jointly hosted by the Wuhan University School of Information Management and Korea’s Sungkyunkwan University Library & Information Science and Data Science Department. The 3,500-year-old city of Wuhan represents a combination of ancient culture and modern living, and conference participants are assured a memorable and rewarding experience.
As always, the iConference will include peer-reviewed papers, posters, workshops and sessions for interaction and engagement, interspersed with multiple opportunities for networking. Early career and next generation researchers can engage in the Doctoral Student Colloquium and Early Career Colloquium.
New this year are special conference programs focused on iSchool Best Practices, and also on iSchools and Industry Partnership. In addition, there will be a special track for papers originating in China.
The iConference brings together scholars and researchers addressing critical information issues in contemporary society. The iConference pushes the boundaries of information studies, explores core concepts and ideas, and creates new technological and conceptual configurations—all shaping interdisciplinary discourses. Affiliation with a member iSchool is not required—all information scholars, researchers, and practitioners are encouraged to make submissions. Visit our website for more information, including sample topics and links to past proceedings: http://ischools.org/the-iconference/
The iConference is presented by the iSchools organization (www.ischools.org), a worldwide association of information schools dedicated to advancing the information field, and preparing students to meet the information challenges of the 21st Century. The event is sponsored by Microsoft Research, and other sponsorships are available.
* Conference: http://ischools.org/the-iconference/
* Submissions: https://www.conftool.com/iConference2017/
* Past Proceedings: http://ischools.org/the-iconference/about-the-iconference/
* Facebook: IConference: https://www.facebook.com/IConference
* Twitter: @iConf | #iconf17
All submissions must be in English using our official template. All work should be original and not previously published. Complete guidelines can be found on the Author Instructions page: http://ischools.org/the-iconference/program/author-instructions/
All submissions are due by September 16, 2016.
We invite papers falling into two categories: completed research or early work/preliminary results. Each paper will be refereed in a double-blind process. The author(s) of the completed research paper judged the best of the iConference will receive the Lee Dirks Award for Best Paper and $5,000.
Submission deadline: September 16, 2016
Papers Chairs: Jian Qin, Syracuse University; Mika Grundstrom, University of Tampere; Jevin West, University of Washington
We welcome submission of posters presenting new work, preliminary results and designs, or educational projects. Posters will undergo a double-blind review, and accepted abstracts will be published in the proceedings.
Submission deadline: September 16, 2016
Posters Chairs: Ann-Sofie Axelsson, University of Boras; Min Song, Yonsei University; Esfandiar Haghverdi, Indiana University
This year’s workshops will be half-day only, and are intended to foster interactive discussions focusing on the particular topic within the purview of the iSchools, namely, the relationships among information, people and technology. Workshops provide a great opportunity for attendees who share common interests and want to have intensive discussions.
Submission deadline: September 16, 2016
Workshops Chairs: Noa Aharony, Bar-Ilan University; Daqing He, University of Pittsburgh; Fred Ye, Nanjing University
* SESSIONS FOR INTERACTION AND ENGAGEMENT (SIE)
These sessions provide an excellent opportunity to present ideas, facilitate discussions, and foster knowledge-sharing in unconventional ways. Formats can include panels, fishbowls, performances, storytelling, roundtable discussions, wildcard sessions, demos/exhibitions, and more. All should be highly participatory, informal, engaging, and pluralistic.
Submission deadline: September 16, 2016
SIE Chairs: Ryan Shaw, University of North Carolina; Andrew Cox, University of Sheffield; Jin Ha Lee, University of Washington
* DOCTORAL COLLOQUIUM
The Doctoral Colloquium provides doctoral students the opportunity to present their work to senior faculty and engage with one another in a setting that is relatively informal but that allows for the fullest of intellectual exchanges. Students receive feedback on their dissertation, career paths, and other areas from participating faculty and student peers.
Application deadline: September 16, 2016
Doctoral Colloquium Chairs: Kevin Crowston, Syracuse University; Bella Jing Zhang, Sun Yat-sen University; Pia Borlund, University of Copenhagen; Jiangping Chen, University of North Texas
* DOCTORAL DISSERTATION AWARD
Recognizing the outstanding dissertation of the preceding year, this competition is open to all member iSchools. Each school may submit one dissertation for consideration. The winner will receive a cash prize of $2,500, the runner up $1,000; both will be honored at the iConference.
Submission deadline: September 16, 2016
Dissertation Award Chairs: Michael Seadle, Humboldt University of Berlin; Soo Young Rieh, University of Michigan; Masooda Bashir, University of Illinois
* EARLY CAREER COLLOQUIUM
This half-day event is intended for assistant professors, post-docs, or others in pre-tenure positions and builds on the tradition of highly successful events at past iConferences. Participants will sign up at registration.
Early Career Colloquium Chairs: Marcia Zeng, Kent State University; Kathy Burnett, Florida State University; Antonio Soares, University of Porto
THE FOLLOWING ARE NEW THIS YEAR
* SPECIAL TRACK: CHINESE PAPERS
This special track for 2017 aims to provide excellent opportunities for scientists, scholars and participants throughout China to present their latest research results and to exchange views or experience not only in Library and Information Science, but also in other related areas. The theme is “Data Management and Services.”
Submission Deadline: September 16, 2016
Chinese Papers Chairs: Ruhua Huang, Wuhan University; Wei Lu, Wuhan University
* SPECIAL PROGRAM: iSCHOOL BEST PRACTICES
This special session will focus on issues pertaining to curriculum/teaching/student experience. iSchools are invited to submit proposals and case studies to be considered for inclusion in this special program.
iSchool Best Practices Chairs: Gary Marchionini, University of North Carolina; Shigeo Sugimoto, University of Tsukuba; Ann-Sofie Axelsson, University of Boras
* SPECIAL PROGRAM: iSCHOOLS AND INDUSTRY PARTNERSHIP
New this year, the iConference will present a number of sessions focused on successful industry partnerships of iSchools worldwide. iSchools are invited to submit proposals in the form of case studies to be considered for inclusion in this special program. http://ischools.org/the-iconference/program/ischools-and-industry-partnersh…
iSchools and Industry Partnership Chairs: Sean McGann, University of Washington; Gobinda Chowdhury, Northumbria University; Joon Lee, Seoul National University
iConference 2017 Chairs: Qing Fang, Wuhan University; Sam Oh, Sungkyunkwan University
Program Chairs: Wonsik Jeff Shim, Sungkyunkwan University; Chuanfu Chen, Wuhan University
Local Arrangements Chairs: Lihong Zhou, Wuhan University; Fei Wang, Wuhan University
Proceedings Chairs: Wayne de Fremery, Sogang University; Preben Hansen, Stockholm University
Alessandro Acquisti, Carnegie Mellon University
Denice Adkins, University of Missouri
Naresh Agarwal, Simmons
Jim Andrews, University of South Florida
Ann-Sofie Axelsson, University of Boras
Tim Baldwin, University of Melbourne
Dania Bilal, University of Tennessee, Knoxville
Leanne Bowler, University of Pittsburgh
Josep Cobarsi Morales, Open University of Catalonia
Constantinos Coursaris, Michigan State University
Gianluca Demartini, University of Sheffield
Stephen Downie, University of Illinois
Karen Fisher, University of Washington
Paul Gandel, Syracuse University
Lisa Given, Charles Sturt University
Sara Grimes, University of Toronto
Lorna Hughes, University of Glasgow
Jaap Kamps, University of Amsterdam
Robert Kauffman, Singapore Management University
Diane Kelly, University of North Carolina-Chapel Hill
Kyung-Sun “Sunny” Kim, University of Wisconsin-Madison
Young Seek Kim, University of Kentucky
Lars Konzack, University of Copenhagen
Andy Koronios, University of South Australia
Serap Kurbanoğlu, Hacettepe University
Philippe Lenca, Télécom Bretagne
Claire R. McInerney, Rutgers University
Julie McLeod, University of Northumbria at Newcastle
Ruth Nalumaga, Makerere University
Sanghee Oh, Florida State University
Virginia Ortíz-Repiso Jiménez, Universidad Carlos III de Madrid
Jung-ran Park, Drexel University
Vivien Petras, Humboldt-Universität zu Berlin
Nils Pharo, University College Oslo and Akershus
Peter Reid, Robert Gordon University Aberdeen
David Reitter, Penn State University
Howard Rosenbaum, Indiana University
Reijo Savolainen, University of Tampere
Kalpana Shankar, University College Dublin
Elizabeth Shepherd, University College London
Antonio Lucas Soares, University of Porto
Martin Souček, Charles University in Prague
Eduardo Vendrell Vidal, Polytechnic University of Valencia
Iris Xie, University of Wisconsin – Milwaukee
Yong Jeong Yi, Sungkyunkwan University
Yan Zhang, University of Texas Austin
Lihong Zhou, Wuhan University
More at: http://ischools.org/the-iconference/
Kevin Crowston | Distinguished Professor of Information Science | School of Information Studies
348 Hinds Hall
Syracuse, New York 13244
t (315) 443.1676 f 315.443.5806 e crowston(a)syr.edu
As I mentioned earlier, I was not sure about the multiple account policy. I
got the notification about the incident being raised, and I will be happy
with whatever decisions Wiki administrators make.
As Denny mentioned, we did not plan anything large-scale, only a
small group of edits. Furthermore, we mentioned that the results were only
valid up to a particular date before the submission of that conference
paper, and things may have already changed a lot (articles removed, edited
further, etc.). We have not made any additions since February, nor do we
plan to do anything further. Whatever we do would be offline.
To Denny's point about other researchers trying to do the same kind of
research, I do see research in this area coming up, and it might make sense
to have certain rules (although I do not have much idea of how rules
work on Wiki in general). I know this because some researchers
have contacted me previously about this work, and they are also looking into
similar areas. One example in this area of work is the following -- this is
Regarding human subjects, neither the conference reviewers nor anyone
from Wikimedia mentioned anything about that earlier. Our
previous works were featured earlier in Wikimedia newsletters (links in
earlier emails), and still nothing was mentioned about it, nor did we find
any information on Wikipedia in general about it. As per the requirements,
approval would be necessary for "data about living individuals through
intervention or interaction" or "identifiable private information about
living individuals". As mentioned, the "about" part is very important,
because no data about editors was used or collected in the research.
If the rules do change, I will keep following the thread, and please also
let me know. I will try to inform all researchers who work in this area if
they get in touch with me.
I am the first author of the paper that Denny has referred to. Firstly, I want
to thank Denny for asking me to join this list and know more about this
1. Regarding quality, we know that there are issues, and even at the
conference, I repeatedly told the audience that I am not satisfied
with the quality of the content generated. However, the percentage of
articles that were not removed when the paper was submitted was minimal. I
have sent Denny a list of the accounts that were used, and it is
possible that several of the articles created from those accounts have been
removed within the last couple of months. I was not aware of the multiple
2. The area of Wikipedia article generation has been explored by others in
the past. [http://www.aclweb.org/anthology/P09-1024,
http://wwwconference.org/proceedings/www2011/companion/p161.pdf] We were
not aware of any rules regarding this sort of experiment. However, we do
understand that such experiments can harm the general quality of this great
encyclopedic resource, hence we did our analysis on the bare minimum of articles.
In fact, we did our initial work on it back in 2014, and Wikimedia research
even covered details about our paper here --
If questions had been raised at that point, we would surely not have done
anything further on this, or would rather have done things offline without
creating or adding any content on Wikipedia.
I understand your point about imposing rules and I think it makes sense.
However, during this research, we were not aware of any rules, and hence
continued our work.
As I have told Denny, our purpose was to check whether we could create
bare-minimum articles that could eventually be improved by authors on
Wikipedia, and also to see whether they would be removed entirely. But it was
done with a few articles, and we did not create anything beyond that point.
Also, we did not make any manual modifications to the articles, although we
saw quality issues, because that would void our analysis and claims.
Thanks everyone for your time and the great work you are doing for the