The DBpedia Wiktionary parser is not built for one special use case. It aims
for flexibility.
The parser can be configured by anyone to fit their use case. It is
also not limited to Wiktionary. We intend to parse other wikis such as
http://wikihow.com or http://wikitravel.org as well.
DBpedia Wiktionary follows several visions:
1. If it is possible to get the data that you have put into Wiktionary
out again, Wiktionary will be strengthened as a central resource.
2. Efforts to extract data from Wiktionary can be focused into one
collaborative project, so that not everybody has to write their own
parser.
3. DBpedia Wiktionary has the potential to become a major hub of
http://linguistics.okfn.org/resources/llod/ , just as DBpedia is the
central hub of http://richard.cyganiak.de/2007/10/lod/ .
It will take some more work to improve the config files step by step for
each language, but this is not unrealistic. During the next week, we will
add dumps for several more languages. We will also migrate the config files
somewhere user-friendly, so that people who want to get data will not
need to download and install software or know Mercurial or Scala.
Sebastian
On 05/14/2012 09:19 PM, Lars Aronsson wrote:
On 2012-05-14 16:54, Christoph Lauer wrote:
However my central problem was that none of this information is
available in the RDF dumps or through the SPARQL endpoint
http://wiktionary.dbpedia.org/sparql, neither born -> bear, nor bear ->
Wiktionary is highly concentrated: A few people and a few templates
generate the vast majority of the content. I think I created half
of the Swedish language entries in the English Wiktionary. If the
people (who?) who run dbpedia.org can explain their needs, perhaps
the templates used in Wiktionary can better support the extraction
of structured data. I don't recall getting any feedback from them.
For the purpose of Swedish entries in the English Wiktionary, "född"
(born, geboren) is treated as an adjective (since it is inflected as
an adjective), with its role as participle of the verb being indicated
in the etymology section. The template {{sv-verb-form-pastpart|föda}}
expands to the text "past participle of föda" and also adds a
category: Swedish past participles, but it doesn't contain any other
mark-up that says this is a past participle. I have no idea how this
is treated by dbpedia.
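
To illustrate the kind of structured data that could in principle be
recovered from the template call described above, here is a minimal
Python sketch. It is not how the DBpedia Wiktionary parser works (that
is driven by its own config files). The helper name
extract_past_participles and the relation label "pastParticipleOf" are
hypothetical, chosen only for this example.

```python
import re

# Matches calls such as {{sv-verb-form-pastpart|föda}} and captures the lemma.
# Only this one template name is handled; any other template is ignored.
TEMPLATE_RE = re.compile(r"\{\{sv-verb-form-pastpart\|([^|}]+)\}\}")


def extract_past_participles(headword, wikitext):
    """Return (headword, relation, lemma) tuples for each template call found.

    "pastParticipleOf" is a made-up relation label, not a DBpedia property.
    """
    return [
        (headword, "pastParticipleOf", match.group(1).strip())
        for match in TEMPLATE_RE.finditer(wikitext)
    ]


if __name__ == "__main__":
    # The entry "född" carries the template pointing at the verb "föda".
    print(extract_past_participles("född", "# {{sv-verb-form-pastpart|föda}}"))
```

The point of the sketch is that the template name itself already
encodes the relation, even though the expanded text contains no
further mark-up.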
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org