This looks very interesting!
Is this a thesaurus that can be used for translation of words across
languages?
Is there some way to quickly have a demo or view the data?
I browsed some files, and I see entries of the kind:
:xf5bfa ww:displayLabel "de:Feliner_Diabetes_mellitus" .
:xf5bfa ww:type wwct:OTHER .
:xf5bfa rdf:type skos:Concept .
:xf5bfa skos:inScheme <
which tells me that diabetes mellitus of a feline is a concept... I was
interested in the animal thesaurus as a way to translate animal names across
languages... There are a lot of files, and I don't know if I am looking at
the right ones. It would be very useful if you could point us to the most
interesting or most understandable datasets.
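If the dumps are plain triple lines like the ones above, a quick way to inspect them might be something like the sketch below. It groups ww:displayLabel values by concept ID, so labels for the same concept in different languages line up (which is essentially the cross-language translation use case). Note this is only a guess at the format: the English label is an invented example, and the regex handles only the exact pattern shown.

```python
import re
from collections import defaultdict

# Sample lines mimicking the dump format quoted above; real files
# would be read from disk. The "en:" label is a made-up example.
triples = """\
:xf5bfa ww:displayLabel "de:Feliner_Diabetes_mellitus" .
:xf5bfa ww:displayLabel "en:Feline_diabetes" .
:xf5bfa rdf:type skos:Concept .
""".splitlines()

# Match: <subject> ww:displayLabel "<lang>:<label>" .
LABEL = re.compile(r'^(\S+) ww:displayLabel "([a-z]{2}):([^"]+)" \.$')

labels = defaultdict(dict)  # concept id -> {language code: label}
for line in triples:
    m = LABEL.match(line.strip())
    if m:
        concept, lang, label = m.groups()
        labels[concept][lang] = label.replace("_", " ")

print(labels[":xf5bfa"])
# -> {'de': 'Feliner Diabetes mellitus', 'en': 'Feline diabetes'}
```

With all dump files fed through the same loop, looking up an animal name in one language and reading off the other entries for that concept would give the translation table I was hoping for.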
I am sorry if the above remarks seem superficial; I cannot read German well
enough to read dissertations in it...
Best, Luca.
On Fri, May 30, 2008 at 2:54 AM, Daniel Kinzler <daniel(a)brightbyte.de>
wrote:
My diploma thesis about a system to automatically build a multilingual
thesaurus from Wikipedia, "WikiWord", is finally done. I handed it in
yesterday. My research will hopefully help to make Wikipedia more accessible
for automatic processing, especially for applications in natural language
processing, machine translation, and information retrieval. What this could
mean for Wikipedia is: better search and conceptual navigation, tools for
suggesting categories, and more.
Here's the thesis (in German, I'm afraid):
<http://brightbyte.de/DA/WikiWord.pdf>
Daniel Kinzler, "Automatischer Aufbau eines multilingualen Thesaurus durch
Extraktion semantischer und lexikalischer Relationen aus der Wikipedia"
("Automatic construction of a multilingual thesaurus by extracting semantic
and lexical relations from Wikipedia"), diploma thesis at the Department of
Automatic Language Processing, Institute of Computer Science, Universität
Leipzig, 2008.
For the curious, http://brightbyte.de/DA/ also contains source code and data.
See <http://brightbyte.de/page/WikiWord> for more information.
Some more data is for now available at
<http://aspra27.informatik.uni-leipzig.de/~dkinzler/rdfdumps/>.
This includes full SKOS dumps for en, de, fr, nl, and no, covering about six
million concepts.
The thesis ended up being rather large: 220 pages and 30k lines of code.
I'm planning to write a research paper in English soon, which will give an
overview of WikiWord and what it can be used for.
The thesis is licensed under the GFDL; WikiWord is GPL software. All data
taken or derived from Wikipedia is GFDL.
Enjoy,
Daniel
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l