On 23.05.2012 18:29, Lars Aronsson wrote:
On 2012-05-23 17:36, Christoph Lauer wrote:
I'm working on the Entry Layouts, which extract Wiktionary data into the DBpedia framework. The first thing I'm interested in is the link to the base form of an inflected verb/adjective.
Which language of Wiktionary, and what is your source format? In the English Wiktionary, the category tree under http://en.wiktionary.org/wiki/Category:Form-of_templates_by_language will guide you to wiki templates used to express that an entry is an inflected form of a base word.
The template I wrote was for the English Wiktionary. I'm not sure what you mean by source format; the entry layouts follow the XML standard as described here: http://wiktionary.dbpedia.org/ (just to make sure we're not talking at cross purposes ;-) ).
For example, two levels down, you will find http://en.wiktionary.org/wiki/Category:Swedish_form-of_templates and http://en.wiktionary.org/wiki/Template:sv-adj-form-abs-indef-n which is used in the entry http://en.wiktionary.org/wiki/oveders%C3%A4gligt to specify that this word is a form of a Swedish adjective. The base word is the first and only parameter.
Interesting that the English subcategory is practically empty, whereas the Swedish subcategory has lots of information about templates. In the given word 'ovedersägligt' you can see that the wiki code for the 'Adjective' section is
===Adjective===
{{head|sv|adjective form}}
# {{sv-adj-form-abs-indef-n|ovedersäglig}}
With the template I want to catch the last line of that to extract the link to the base form 'ovedersäglig'. DBpedia doesn't have a Swedish database (yet), but if you take, for example, the English word 'took', you can see the entry under http://wiktionary.dbpedia.org/page/took. There's no reference to the base form, so I would like to add it. That's what it's all about ;-)
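To make the idea concrete, here is a rough Python sketch of the matching step. It is only an illustration and not the entry layout syntax of the DBpedia framework (those are XML, as linked above). The function name extract_base_forms and the regular expression are my own assumptions, and the pattern only handles definition lines that hold a single form-of template with exactly one parameter, as in the Swedish example.

```python
import re

# Hypothetical pattern (an assumption, not part of the DBpedia framework):
# matches a definition line such as "# {{sv-adj-form-abs-indef-n|ovedersäglig}}",
# i.e. one template call with exactly one parameter.
FORM_OF_LINE = re.compile(
    r"^#\s*\{\{(?P<template>[^|{}]+)\|(?P<base>[^|{}]+)\}\}\s*$"
)


def extract_base_forms(wikitext):
    """Return (template name, base form) pairs found on definition lines."""
    results = []
    for line in wikitext.splitlines():
        match = FORM_OF_LINE.match(line.strip())
        if match:
            results.append(
                (match.group("template").strip(), match.group("base").strip())
            )
    return results


sample = """===Adjective===
{{head|sv|adjective form}}
# {{sv-adj-form-abs-indef-n|ovedersäglig}}"""

print(extract_base_forms(sample))
```

The headword line is skipped because it does not start with '#'. Templates with more than one parameter would need a looser pattern, and this sketch does not check that the template really is a form-of template.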