-- Nadja Kutz wrote
Is it possible to briefly explain the major differences
between
DBpedia and the Yago Knowledge graph?
Both projects aim to extract a so-called ontology from Wikipedia. An
ontology in this sense is a graph (= a kind of net), in which the
nodes are entities (like Albert Einstein, or the city of Ulm) and the
links between the nodes are relationships (like "wasBornIn"). See here
for an example:
http://www.mpi-inf.mpg.de/departments/ontologies/areas/index.html
For this purpose, both projects use the structured information of
Wikipedia, i.e., its category system and its infoboxes. Both projects
have extracted a graph of several million nodes, and tens of
millions of links between them. Seen this way, the projects go in the
same direction as what Wikidata aims to do, but in an automated
fashion.
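To make the graph idea concrete, here is a minimal sketch in Python. It is not how either project stores its data; the triples are toy data, and the relation name "isLocatedIn" and the helper build_graph are only chosen for illustration.

```python
from collections import defaultdict

# Toy facts as (subject, relationship, object) triples.
# The identifiers are illustrative, not taken from YAGO or DBpedia dumps.
triples = [
    ("Albert_Einstein", "wasBornIn", "Ulm"),
    ("Ulm", "isLocatedIn", "Germany"),
]


def build_graph(facts):
    """Index triples by subject: node -> list of (relationship, target node)."""
    graph = defaultdict(list)
    for subject, relation, obj in facts:
        graph[subject].append((relation, obj))
    return graph


graph = build_graph(triples)

# Follow the outgoing links of one node.
for relation, target in graph["Albert_Einstein"]:
    print("Albert_Einstein", relation, target)
```

Each triple is one link in the net: the first and last elements are nodes, the middle element is the label on the link.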
Both projects share the same goal, but have different foci:
* YAGO has a set of around 100 relationships and maps Wikipedia
infobox attributes to them. DBpedia, in contrast, has two systems:
- one system, in which each Wikipedia infobox attribute becomes a
relationship. This set of data is rather noisy, but very exhaustive.
- another system, in which relationships are defined and mapped from
infobox attributes by a community of volunteers.
The differences between these two systems are summarized in Chapter 10.3 of
http://www.mpi-inf.mpg.de/yago-naga/yago/publications/aij.pdf
* DBpedia is the hub of the linked data cloud. YAGO is also in this
cloud, but not as central as DBpedia.
* YAGO attaches time and space information to many of its entities,
i.e., it knows when and where certain facts happened, and integrates
this information with data from Geonames. This aspect is less
prominent in DBpedia.
* YAGO has traditionally put much emphasis on logical constraint
checking, type checking, and a strong type hierarchy -- all in order
to maintain a high precision of the data. DBpedia, in contrast,
imports one of its type hierarchies from YAGO, and builds its own,
flatter, type hierarchy through a community of volunteers.
* YAGO has been evaluated manually, which yields an estimated
precision value for each of its relations. For the relation
"actedInMovie", for example, we know that 97% of the statements are
correct. Details are here:
http://www.mpi-inf.mpg.de/yago-naga/yago/statistics.html
For DBpedia, there is no such analysis to our knowledge.
* The two ontologies have a surprisingly small overlap of data and
instances if mapped naively, see Chapter 5.2.3 in
http://suchanek.name/work/publications/iswc2011.pdf
... but a larger overlap if mapped in a more sophisticated way, see
Section 6.4 in
http://suchanek.name/work/publications/vldb2012.pdf
* There are certainly a number of other differences that I may not
know of or may have overlooked; please feel free to add them.
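The first difference in the list above (a fixed set of relationships versus one relationship per infobox attribute) can be sketched roughly as follows. This is only an illustration under assumptions: the infobox, the attribute names, the relation name "wasBornOnDate", and both helper functions are hypothetical and do not reproduce the actual extraction code of either project.

```python
# Hypothetical infobox of one Wikipedia article, as attribute -> value.
infobox = {
    "birth_place": "Ulm",
    "birth_date": "1879-03-14",
    "image_size": "200px",
}

# Hypothetical hand-made mapping from infobox attributes to a fixed
# set of relationships (the curated style of extraction).
CURATED_MAPPING = {
    "birth_place": "wasBornIn",
    "birth_date": "wasBornOnDate",
}


def extract_raw(entity, box):
    """Every infobox attribute becomes its own relationship: exhaustive, but noisy."""
    return [(entity, attribute, value) for attribute, value in box.items()]


def extract_mapped(entity, box, mapping):
    """Only attributes with a curated mapping yield facts: fewer, but cleaner."""
    return [
        (entity, mapping[attribute], value)
        for attribute, value in box.items()
        if attribute in mapping
    ]


raw_facts = extract_raw("Albert_Einstein", infobox)
mapped_facts = extract_mapped("Albert_Einstein", infobox, CURATED_MAPPING)

print(len(raw_facts), "raw facts,", len(mapped_facts), "mapped facts")
```

The raw extraction keeps layout attributes such as "image_size" as if they were facts, which is where the noise comes from; the mapped extraction drops everything nobody has mapped yet.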
What is the WWW conference?
The WWW conference is a scientific conference in computer science on
the newest developments of the Web. It is in Lyon this year:
http://www2012.wwwconference.org
Cheers
Fabian
--
Fabian online:
http://suchanek.name