Wiki-research-l April 2010

wiki-research-l@lists.wikimedia.org

21 participants
6 discussions

by song＠cs.umn.edu

Pursuant to prior discussions about the need for a research policy on Wikipedia, WikiProject Research is drafting a policy regarding the recruitment of Wikipedia users to participate in studies. At this time, we have a proposed policy, and an accompanying group that would facilitate recruitment of subjects in much the same way that the Bot Approvals Group approves bots.

The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research

The Subject Recruitment Approvals Group mentioned in the proposal is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group

Before we move forward with seeking approval from the Wikipedia community, we would like additional input about the proposal, and would welcome additional help improving it.

Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research

--
Bryan Song
GroupLens Research
University of Minnesota

9 months, 2 weeks

Wikimania 2010: Call for Participation is there!

by Marcin Cieslak

Wikimania is an annual global event devoted to Wikimedia projects around the globe (including Wikipedia, Wikibooks, Wikinews, Wiktionary, Wikispecies, Wikimedia Commons, and MediaWiki). The conference is a community gathering, giving the editors, users and developers of Wikimedia projects an opportunity to meet each other, exchange ideas, report on research and projects, and collaborate on the future of the projects. The conference is open to the public, and is a chance for educators, researchers, programmers and free culture activists who are interested in the Wikimedia projects to learn more and share ideas about the Wikimedia projects.

This year's conference will be held JULY 9-11, 2010 in Gdansk, Poland at the Polish Baltic Philharmonic. For more information, please visit the official Wikimania 2010 site:
http://wikimania2010.wikimedia.org/

Wikimania 2010 will be a mix of submitted talks, open space meetings, birds of a feather groups, and lightning talks. Submissions will be discussed and selected in an informal process on the wiki. If your submission is not added to the schedule, you will still have many opportunities to bring topics forward on-site.

IMPORTANT DATES

* Deadline for submitting workshop, tutorial, panel and presentation proposals: May 20
* Notification of acceptance: May 25 (workshops), May 31 (panels, tutorials, presentations)
* All proposals and presentations will be welcome in the Open Space track of the conference, whether or not they are accepted in this initial process.

PROGRAM COMMITTEE

Submissions will be reviewed informally by a team of volunteers.

TRACKS

This year Wikimania will offer three tracks for submissions for members of wiki communities and interested observers to share their own experiences and thoughts and to present new ideas:

People and Community

The People and Community track provides a unique forum for discussing topics related to people using/building wikis. Relevant topics include, but are not restricted to, the following:

* Wiki Community: Conflict resolution and community dynamics; reputation and identity;
* Wiki Outreach: Promotion of wikis and Wikimedia projects among the general public;
* North meets south, east meets west: How can people of different cultural backgrounds create an encyclopedia according to common rules? Same subject in the eye of different cultures.
* Special: Wikipedia in Central/Eastern Europe: this theme will provide a forum to present and discuss the latest progress of Wikis in the central/eastern European community.

Knowledge and Collaboration

The Knowledge and Collaboration track aims to promote research and find exciting ideas related to knowledge...

* Wiki Content: New ways to improve content quality, credibility; legal issues and copyrights (is free knowledge free?); use of the content in education, journalism, research;
* Semantic Wikis: The use of semantic web technologies, linked data; semantic annotation and metadata (in particular manual vs. automated approaches).

Infrastructure Track

The Infrastructure track at Wikimania will provide a forum where both researchers and practitioners can share new approaches, applications, and explore how to make Wiki access ever more ubiquitous:

* MediaWiki development: issues related to MediaWiki development and extensions;
* Moving beyond MediaWiki: what other Wiki-like platforms exist; what tools and features do we need for collaboration on different types of knowledge?
* Mobile Wikis: The Web is moving off the desktop and into mobile phones; how do we use wikis on mobile devices? Wiki-based Augmented Reality (AR) applications, location based services
* User Interface Design: Usability and user experience; accessibility, adaptive interfaces and personalization; novel UI designs.

WIKISYM 2010

Please note that Wikimania 2010 is co-located with WikiSym, The International Symposium on Wikis and Open Collaboration. More information about WikiSym can be found on the conference website:
http://www.wikisym.org/

SUBMIT A PROPOSAL

To submit a proposal for a presentation, workshop, panel or tutorial, please visit:
http://bit.ly/Submit2010

Thank you for helping make Wikimania 2010 a successful event. :-)

See you in Gdansk, July 9-11!

--
Marcin Cieslak
Wikimania 2010 Gdansk

13 years, 12 months

Access to HTTP access logs for Wikipedia articles?

by S. Nunes

Hi all,

I presume that Wikipedia keeps data about HTTP accesses to all articles. Can anybody tell me whether this data is available for research purposes? I am particularly interested in HTTP referral information for each article. I suspect that this information could be used to estimate topical relevance for each document. Access to this information poses no risk to users' privacy since no user information is made available: session IDs, hour/minute timestamp data and IPs could be easily discarded.

I am new to this list, so I really don't know if this has been previously discussed. I searched the archives and found no relevant results on this issue.

Thanks in advance for your feedback,

--
Sérgio Nunes

14 years

2nd CfP Special Issue on Collaboratively Constructed Language Resources (LRE Journal)

by Torsten Zesch

Language Resources and Evaluation Journal
Special Issue on "Collaboratively Constructed Language Resources"
http://www.ukp.tu-darmstadt.de/scientific-community/special-issue-language-…

KEYWORDS

Wikipedia, Wiktionary, Mechanical Turk, Games with a Purpose, Folksonomies, Twitter, Social Networks

INTRODUCTION

In recent years, online resources collaboratively constructed by ordinary users on the Web have considerably influenced the language resources community. They have been successfully used for example as a substitute for conventional language resources and as semantically structured corpora. Particularly, knowledge acquisition bottlenecks and coverage problems pertinent to conventional language resources can be overcome by collaboratively constructed resources. The resource that has gained the greatest popularity in this respect so far is Wikipedia. However, other promising resources were recently discovered, such as folksonomies, Twitter, the Wiki dictionary Wiktionary, social Q&A sites like WikiAnswers, approaches based on Mechanical Turk, or game-based approaches.

The benefits of using collaboratively constructed resources come along with new challenges, such as the interoperability with existing resources, or the quality of the extracted lexical semantic knowledge. Interoperability between resources is crucial as no single resource provides perfect coverage. The quality of collaboratively constructed resources is a fundamental issue, as they often lack editorial control or contain incomplete entries. These challenges actually present a chance for natural language processing methods to improve the quality of collaboratively constructed resources. Researchers have therefore proposed techniques for link prediction or information extraction that can be used to guide the "crowds" in constructing resources of better quality.

TOPICS OF INTEREST

Specific topics include but are not limited to:

- Analysis of collaboratively constructed resources, such as wiki-based resources, folksonomies, Twitter, or social networks;
- Using special features of collaboratively constructed resources to create novel resource types, for example revision-based corpora, simplified versions of resources, etc.;
- Analyzing the structure of collaboratively constructed resources related to their use in computational linguistics and language technology;
- Interoperability of collaboratively constructed resources with conventional language resources and between themselves;
- Mining social and collaborative content for constructing structured language resources (e.g. lexical semantic resources) and the corresponding tools;
- Mining multilingual information from collaboratively constructed resources;
- Game-based approaches to resource creation;
- Mechanical Turk for building language resources;
- Quality and reliability of collaboratively constructed language resources.

We would also like to welcome papers outlining the challenges related to using collaboratively constructed resources in computational linguistics and language technology, and spanning the cross-disciplinary boundaries to discourse analysis, social network analysis, and artificial intelligence.

IMPORTANT DATES

Submission deadline: Jul 1, 2010
Preliminary decisions: Oct 1, 2010
Submission of revised articles: Nov 1, 2010
Final versions due: Feb 1, 2011

SUBMISSION GUIDELINES

Submissions should not exceed 30 pages, must be in English, and must follow the submission guidelines on the Language Resources and Evaluation Web site:
http://www.springer.com/education/linguistics/computational+linguistics/jou…

Submissions will be reviewed according to the standards of the LRE journal. Papers should not have been submitted or published elsewhere but may be substantially extended or refined versions of conference papers. Substantially extended and revised versions of papers accepted at previous workshops concerned with collaboratively constructed semantic resources, e.g. the ACL 2009 workshop on "Collaboratively Constructed Semantic Resources" or the forthcoming COLING 2010 workshop on the same topic, are encouraged.
http://www.ukp.tu-darmstadt.de/scientific-community/acl-ijcnlp-2009-worksho…
http://www.ukp.tu-darmstadt.de/scientific-community/coling-2010-workshop/

Authors are encouraged to send a brief email to Torsten Zesch (lastname (at) tk.informatik.tu-darmstadt.de) as soon as possible, indicating their intention to submit an article and including their contact information and the topic they intend to address in their submissions.

To submit papers:

- Go to http://www.editorialmanager.com/chum/
- Register and login as an author
- Select "SI: Collaboratively Constructed Semantic" as Paper Type
- Follow the instructions on the screen

GUEST EDITORS

Iryna Gurevych and Torsten Zesch
UKP Lab
Technische Universität Darmstadt
http://www.ukp.tu-darmstadt.de

PRELIMINARY GUEST EDITORIAL BOARD

Further responses are pending and will be announced shortly.

Anette Frank, Heidelberg University
Christiane Fellbaum, Princeton University
Delphine Bernhard, LIMSI-CNRS, Orsay, France
Diana McCarthy, University of Sussex
Graeme Hirst, University of Toronto
Gregory Grefenstette, Exalead, Paris, France
György Szarvas, Technische Universität Darmstadt
Lothar Lemnitzer, BBAW, Berlin, Germany
Massimo Poesio, University of Essex
Piek Vossen, University Amsterdam, Netherlands
Rada Mihalcea, University of North Texas
Saif Mohammad, National Research Council Canada

ABOUT THE JOURNAL

Language Resources and Evaluation is the first publication devoted to the acquisition, creation, annotation, and use of language resources, together with methods for evaluation of resources, technologies, and applications.

14 years

2nd CfP COLING 2010 - 2nd Workshop on Collaboratively Constructed Semantic Resources

by Torsten Zesch

COLING 2010
2nd Workshop on "The People's Web meets NLP: Collaboratively Constructed Semantic Resources"
Beijing, August 28th, 2010
http://www.ukp.tu-darmstadt.de/scientific-community/coling-2010-workshop/

Keywords: Wikipedia, Wiktionary, Mechanical Turk, Games with a purpose, Folksonomies, Twitter, Social Networks

INTRODUCTION

The workshop builds upon the success of the first ACL "The People's Web meets NLP" Workshop in 2009 that attracted 21 submissions. Accepted submissions included papers on Wikipedia [1], Wiktionary [2], Mechanical Turk [3], and game-based construction of semantic resources [4]. This clearly demonstrates a substantial and growing interest of the NLP community in collaboratively constructed semantic resources (CSRs), also evidenced by the increasing number of publications in this area and the EMNLP 2009 Web 2.0 track.

In many works, CSRs have been used to overcome the knowledge acquisition bottleneck and coverage problems pertinent to conventional lexical semantic resources. The greatest popularity in this respect can so far certainly be attributed to Wikipedia [1]. However, other resources, such as folksonomies or the multilingual collaboratively constructed dictionary Wiktionary, have also shown great potential. Thus, the scope of the workshop deliberately includes any collaboratively constructed resource, not only Wikipedia.

Effective deployment of CSRs to enhance NLP introduces a pressing need to address a set of fundamental challenges, e.g. the interoperability with existing resources, or the quality of the extracted lexical semantic knowledge. Interoperability between resources is crucial as no single resource provides perfect coverage. The quality of CSRs is a fundamental issue, as they lack editorial control and entries are often incomplete. Thus, techniques for link prediction [5] or information extraction [6] have been proposed to guide the "crowds" while constructing resources of better quality.

[1] Olena Medelyan, David Milne, Catherine Legg and Ian H. Witten. Mining meaning from Wikipedia. In: International Journal of Human-Computer Studies. 67(9), 2009.
[2] Torsten Zesch, Christof Mueller and Iryna Gurevych. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary. Proceedings of the Conference on Language Resources and Evaluation (LREC), 2008.
    http://www.ukp.tu-darmstadt.de/software/jwpl/
    http://www.ukp.tu-darmstadt.de/software/jwktl/
[3] Rion Snow, Brendan O'Connor, Daniel Jurafsky and Andrew Y. Ng. Cheap and Fast---But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks. Proceedings of EMNLP, 2008.
[4] Luis von Ahn and Laura Dabbish. General Techniques for Designing Games with a Purpose. Communications of the ACM, 2008.
[5] Rada Mihalcea and Andras Csomai. Wikify!: Linking Documents to Encyclopedic Knowledge. Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007.
[6] Daniel S. Weld et al. Intelligence in Wikipedia. Twenty-Third Conference on Artificial Intelligence (AAAI), 2008.

TOPICS

The workshop will bring together researchers from different worlds, for example those using collaboratively constructed resources as sources of lexical semantic information for NLP purposes such as information retrieval, named entity recognition, or keyword extraction, and those using NLP techniques to improve the resources or extract and analyze different types of lexical semantic information from them. We will especially welcome contributions of interdisciplinary nature, e.g. those applying discourse analysis techniques from computational linguistics to the content of CSRs to better understand their properties.

Specific topics include but are not limited to:

* Analysis of collaboratively constructed resources, such as wiki-based platforms, folksonomies, Twitter, or social networks;
* Using collaboratively constructed resources for NLP purposes such as information retrieval, text categorization, information extraction, etc.;
* Using special features of collaboratively constructed resources to create novel resource types, for example revision-based corpora, simplified versions of resources, etc.;
* Analyzing the structure of collaboratively constructed resources related to their use in NLP;
* Interoperability of collaboratively constructed resources with conventional lexical semantic resources and between themselves;
* Mining social and collaborative content for constructing structured semantic resources and the corresponding tools;
* Mining multilingual information from collaboratively constructed resources;
* Quality and reliability of collaboratively constructed semantic resources.

We especially encourage short papers describing publicly available tools for accessing or analyzing collaboratively constructed resources that can serve as a multiplier in the NLP community.

The workshop is intended to be highly interdisciplinary. Thus, we encourage the participation of researchers working on computational linguistics aspects (e.g. parsing or discourse analysis) or NLP applications (e.g. information retrieval, information extraction, question answering, and knowledge representation) as well as researchers from other areas who might benefit from collaboratively constructed semantic resources.

Substantially extended versions of the best papers from the workshop can be submitted to a planned Special Issue in one of the major computational linguistics journals. The revised papers will have to undergo a separate reviewing process required for journal publications.

IMPORTANT DATES

Paper submission deadline (full and short): May 30, 2010
Notification of acceptance of papers: June 30, 2010
Camera-ready copy of papers due: July 10, 2010
COLING 2010 Workshop: Aug 28, 2010

ORGANIZERS

Iryna Gurevych
Torsten Zesch
Ubiquitous Knowledge Processing Lab
Technische Universität Darmstadt, Germany

PROGRAM COMMITTEE

Andras Csomai, Google Inc.
Anette Frank, Heidelberg University
Benno Stein, Bauhaus University Weimar
Bernardo Magnini, ITC-irst Trento
Christiane Fellbaum, Princeton University
Dan Moldovan, University of Texas at Dallas
Delphine Bernhard, LIMSI-CNRS, Orsay
Diana McCarthy, University of Sussex
Elke Teich, Technische Universität Darmstadt
Emily Pitler, University of Pennsylvania
Eneko Agirre, University of the Basque Country
Erhard Hinrichs, Eberhard Karls Universität Tübingen
Ernesto De Luca, Technische Universität Berlin
Florian Laws, University of Stuttgart
Gerard de Melo, MPI Saarbrücken
German Rigau, University of the Basque Country
Graeme Hirst, University of Toronto
Günter Neumann, DFKI Saarbrücken
György Szarvas, Technische Universität Darmstadt
Hans-Peter Zorn, European Media Lab, Heidelberg
José Iria, University of Sheffield
Laurent Romary, LORIA, Nancy
Magnus Sahlgren, Swedish Institute of Computer Science
Manfred Stede, Potsdam University
Omar Alonso, A9.com, Inc.
Pablo Castells, Universidad Autónoma de Madrid
Paul Buitelaar, DERI, National University of Ireland, Galway
Philipp Cimiano, Delft University of Technology
Razvan Bunescu, University of Texas at Austin
Rene Witte, Concordia University Montréal
Roxana Girju, University of Illinois at Urbana-Champaign
Saif Mohammad, University of Maryland
Samer Hassan, University of North Texas
Sören Auer, Leipzig University
Tonio Wandmacher, CEA, Paris

14 years

Help to solve three doubts on Wikipedia research data

by Fuster, Mayo

Hi everybody!

How are you? I hope you are happy and well. I am Mayo Fuster Morell, and I am doing PhD research on Wikipedia governance at the European University Institute.

I would appreciate it if you could help me with three specific questions that I have about Wikipedia data.

* Is there data or are there research results on the number of users per article? Also, what is the most frequent number of users per article? Or, what is the distribution of the number of editors per article?
* Is there data on gender distribution? I know there is the general survey (which says 12.83% of contributors are women) and Ortega's work. Is there any other data on gender distribution?
* Does the site learn from navigation and searches? That is, if a Wikipedia visitor who reads the Network entry then goes to the Manuel Castells entry, will the system understand there is a connection between them? Will it put them together next time when presenting search results?

Any suggestions that help answer these questions are very welcome!

Looking forward to Wikimania!

Mayo

«·´`·.(*·.¸(`·.¸ ¸.·´)¸.·*).·´`·»
«·´¨*·¸¸« Mayo Fuster Morell ».¸.·*¨`·»
«·´`·.(¸.·´(¸.·* *·.¸)`·.¸).·´`·»

Research Digital Commons Governance: http://www.onlinecreation.info
European University Institute - PhD Candidate
School of Information Berkeley - Visiting researcher
Phone Italy: 0039-3345440747 or 0039-0558409982
Phone Spanish State: 0034-648877748
E-mail: mayo.fuster(a)eui.eu
Skype: mayoneti
Identi.ca: Mayo
Postal address: Badia Fiesolana - Via dei Roccettini 9, I-50014 San Domenico di Fiesole (FI) - Italy
Fax [+39] 055 4685 201

14 years
