===========================
NLP & DBpedia Workshop 2013
===========================
Free, open, interoperable and multilingual NLP for DBpedia and
DBpedia for NLP: http://nlp-dbpedia2013.blogs.aksw.org/
Collocated with the International Semantic Web Conference 2013 (ISWC
2013)
21-22 October 2013, in Sydney, Australia (*Submission deadline July
8th*)
**********************************
Recently, the DBpedia community has experienced an immense increase
in activity, and we believe that the time has come to explore the
connection between DBpedia & Natural Language Processing (NLP)
in unprecedented depth. The goal of this workshop can be
summarized by this (pseudo-) formula:
DBpedia has a long-standing tradition of providing useful data as well
as a commitment to reliable Semantic Web technologies and living
best practices. With the rise of Wikidata, DBpedia is gradually being
relieved of the tedious extraction of data from Wikipedia’s
infoboxes and can shift its focus to new challenges such as
extracting information from the unstructured article text and
becoming a testing ground for multilingual NLP methods.
Contribution
============
Within the timeframe of this workshop, we hope to mobilize a
community of stakeholders from the Semantic Web area. We expect
the workshop to produce the following items:
* an open call to the DBpedia data consumer community will
generate a wish list of data to be extracted from
Wikipedia by NLP methods. This wish list will be broken down into
tasks and benchmarks, and a gold standard will be created.
* the benchmarks and test data created will be collected and
published under an open license for future evaluation (inspired by
OAEI and UCI-ML). An overview of the benchmarks can be found here:
http://nlp-dbpedia2013.blogs.aksw.org/benchmarks
Important dates
===============
8 July 2013, Paper Submission Deadline
9 August 2013, Notification of accepted papers sent to authors
Motivation
==========
The central role of Wikipedia (and therefore DBpedia) for the
creation of a Translingual Web has recently been recognized by the
Strategic Research Agenda (cf. section 3.4, page 23). Most of the
contributions to the recently held Dagstuhl seminar on the
Multilingual Semantic Web also stress the role of Wikipedia for
multilingualism. As more and more language-specific chapters of
DBpedia appear (currently 14 language editions), DBpedia is becoming
a driving factor for a Linguistic Linked Open Data cloud as well as
localized LOD clouds with specialized domains (e.g. the Dutch
windmill domain ontology created from http://nl.dbpedia.org ).
The data contained in Wikipedia and DBpedia have ideal properties
for making them a controlled testbed for NLP. Wikipedia and DBpedia
are multilingual and multi-domain, and the communities maintaining these
resources are very open, so it is easy to join and contribute. The
open license allows data consumers to benefit from the content, and
many parts are collaboratively editable. The data in DBpedia, in
particular, is widely used and disseminated throughout the Semantic Web.
NLP4DBpedia
===========
DBpedia has been around for quite a while, infusing the Web of Data
with multi-domain data of decent quality. Its triples are,
however, mostly extracted from Wikipedia infoboxes. To unlock the
full potential of Wikipedia articles for DBpedia, the information
contained in the remaining part of the articles needs to be analysed
and triplified. This is where NLP techniques can help.
DBpedia4NLP
===========
Conversely, NLP, and information extraction techniques in
particular, rely on various resources when processing texts from
different domains. These resources may serve as a component of a
solution (e.g. a gazetteer that forms an important part of an
expert-written rule, or a disambiguation resource) or may be used
while building a solution (e.g. within machine learning approaches).
DBpedia easily fits both of these roles.
We invite papers from both of these areas, including:
1. Knowledge extraction from text and HTML documents (especially
unstructured and semi-structured documents) on the Web, using
information in the Linked Open Data (LOD) cloud, and especially in
DBpedia.
2. Representation of NLP tool output and NLP resources as RDF/OWL,
and linking the extracted output to the LOD cloud.
3. Novel applications using the extracted knowledge, the Web of Data
or DBpedia-based NLP methods.
The specific topics are listed below.
Topics
======
- Improving DBpedia with NLP methods
- Finding errors in DBpedia with NLP methods
- Annotation methods for Wikipedia articles
- Cross-lingual data and text mining on Wikipedia
- Pattern and semantic analysis of natural language, reading the
Web, learning by reading
- Large-scale information extraction
- Entity resolution and automatic discovery of Named Entities
- Multilingual recognition of real-world entities
- Frequent pattern analysis of entities
- Relationship extraction, slot filling
- Entity linking, Named Entity disambiguation, cross-document
co-reference resolution
- Disambiguation through knowledge bases
- Ontology representation of natural language text
- Analysis of ontology models for natural language text
- Learning and refinement of ontologies
- Natural language taxonomies modeled as Semantic Web ontologies
- Use cases for potential data extracted from Wikipedia articles
- Use cases of entity recognition for Linked Data applications
- Impact of entity linking on information retrieval, semantic search
Furthermore, an informal list of NLP tasks can be found on this
Wikipedia page:
http://en.wikipedia.org/wiki/Natural_language_processing#Major_tasks_in_NLP
These are relevant for the workshop as long as they fit into the
DBpedia4NLP and NLP4DBpedia frame (i.e. the data used revolves
around Wikipedia and DBpedia).
Submission formats
==================
Paper submission
-----------------------
All papers must represent original and unpublished work that is not
currently under review. Papers will be evaluated according to their
significance, originality, technical content, style, clarity, and
relevance to the workshop. At least one author of each accepted
paper is expected to attend the workshop.
* Full research papers (up to 12 pages)
* Position papers (up to 6 pages)
* Use case descriptions (up to 6 pages)
* Data/benchmark papers (2-6 pages, depending on the size and
complexity)
Note: data and benchmark papers are meant to provide a citable
reference for your data and benchmarks. We require that you
upload any data you use to our benchmark repository in parallel with
the submission. We recommend using an open license (e.g. CC-BY),
but the minimum requirement is free use. Please write to the mailing
list if you have any problems.
Submission of data and use cases
--------------------------------------------
This workshop also targets non-academic users and developers. If you
have any (open) data (e.g. texts or annotations) that can be used
for benchmarking NLP tools, but do not want or need to write an
academic paper about it, please feel free to just add it to this
table: http://tinyurl.com/nlp-benchmarks or upload it to our
repository: http://github.com/dbpedia/nlp-dbpedia
Also, if you have any ideas, use cases or data requests, please feel
free to just post them on our mailing list: nlp-dbpedia-public [at]
lists.informatik.uni-leipzig.de or send them directly to the chairs:
nlp-dbpedia2013 [at] easychair.org
Program committee
=================
* Guadalupe Aguado, Universidad Politécnica de Madrid, Spain
* Chris Bizer, Universität Mannheim, Germany
* Volha Bryl, Universität Mannheim, Germany
* Paul Buitelaar, DERI, National University of Ireland, Galway
* Charalampos Bratsas, OKFN, Greece, Αριστοτέλειο Πανεπιστήμιο
Θεσσαλονίκης (Aristotle University of Thessaloniki), Greece
* Philipp Cimiano, CITEC, Universität Bielefeld, Germany
* Samhaa R. El-Beltagy, جامعة_النيل (Nile University), Egypt
* Daniel Gerber, AKSW, Universität Leipzig, Germany
* Jorge Gracia, Universidad Politécnica de Madrid, Spain
* Max Jakob, Neofonie GmbH, Germany
* Anja Jentzsch, Hasso-Plattner-Institut, Potsdam, Germany
* Ali Khalili, AKSW, Universität Leipzig, Germany
* Daniel Kinzler, Wikidata, Germany
* David Lewis, Trinity College Dublin, Ireland
* John McCrae, Universität Bielefeld, Germany
* Uroš Milošević, Institut Mihajlo Pupin, Serbia
* Roberto Navigli, Sapienza, Università di Roma, Italy
* Axel Ngonga, AKSW, Universität Leipzig, Germany
* Asunción Gómez Pérez, Universidad Politécnica de Madrid, Spain
* Lydia Pintscher, Wikidata, Germany
* Elena Montiel Ponsoda, Universidad Politécnica de Madrid, Spain
* Giuseppe Rizzo, Eurecom, France
* Harald Sack, Hasso-Plattner-Institut, Potsdam, Germany
* Felix Sasaki, Deutsches Forschungszentrum für künstliche
Intelligenz, Germany
* Mladen Stanojević, Institut Mihajlo Pupin, Serbia
* Hans Uszkoreit, Deutsches Forschungszentrum für künstliche
Intelligenz, Germany
* Rupert Westenthaler, Salzburg Research, Austria
* Feiyu Xu, Deutsches Forschungszentrum für künstliche Intelligenz,
Germany
Contact
=======
We would prefer that you post any questions and
comments regarding NLP and DBpedia to our public mailing list at:
nlp-dbpedia-public [at] lists.informatik.uni-leipzig.de
If you want to contact the chairs of the workshop directly, please
write to:
nlp-dbpedia2013 [at] easychair.org
Kind regards,
Sebastian Hellmann, Agata Filipowska, Caroline Barrière,
Pablo N. Mendes, Dimitris Kontokostas