-- Nadja Kutz wrote
Is it possible to briefly explain the major differences between DBpedia and the Yago Knowledge graph?
Both projects aim to extract a so-called ontology from Wikipedia. An ontology in this sense is a graph (= a kind of net), in which the nodes are entities (like Albert Einstein, or the city of Ulm) and the links between the nodes are relationships (like "wasBornIn"). See here for an example: http://www.mpi-inf.mpg.de/departments/ontologies/areas/index.html
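To make the graph idea concrete, here is a minimal sketch in Python. It is not how either project stores its data; the list of triples, the "isLocatedIn" relation and the helper `objects_of` are assumptions made up for this illustration.

```python
from collections import defaultdict

# Toy facts as (node, link, node) triples. Illustrative only:
# "isLocatedIn" is an assumed relation name, not taken from either project.
facts = [
    ("Albert_Einstein", "wasBornIn", "Ulm"),
    ("Ulm", "isLocatedIn", "Germany"),
]

# Index the outgoing links of each node.
outgoing = defaultdict(list)
for subject, relation, obj in facts:
    outgoing[subject].append((relation, obj))


def objects_of(subject, relation):
    """Return all nodes reachable from `subject` via a link labelled `relation`."""
    return [o for r, o in outgoing[subject] if r == relation]
```

Calling `objects_of("Albert_Einstein", "wasBornIn")` follows the "wasBornIn" link from the Einstein node to the Ulm node.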
For this purpose, both projects use the structured information of Wikipedia, i.e., its category system and its infoboxes. Both projects have extracted a graph of several million nodes, with tens of millions of links between them. Seen this way, the projects go in the same direction as what Wikidata aims to do, but in an automated fashion.
Both projects share the same goal, but have different foci:
* YAGO has a set of around 100 relationships and maps Wikipedia infobox attributes to them. DBpedia, in contrast, has two systems:
  - one system in which each Wikipedia infobox attribute becomes a relationship. This set of data is rather noisy, but very exhaustive.
  - another system in which relationships are defined and mapped from infobox attributes by a community of volunteers.
  The differences between these two systems are summarized in Chapter 10.3 here: http://www.mpi-inf.mpg.de/yago-naga/yago/publications/aij.pdf
* DBpedia is the hub of the linked data cloud. YAGO is also in this cloud, but not as central as DBpedia.
* YAGO attaches time and space information to many of its entities, i.e., it knows when and where certain facts happened, and integrates this information with data from Geonames. This aspect is less prominent in DBpedia.
* YAGO has traditionally put much emphasis on logical constraint checking, type checking, and a strong type hierarchy -- all in order to maintain a high precision of the data. DBpedia, in contrast, imports one of its type hierarchies from YAGO, and builds its own, flatter, type hierarchy through a community of volunteers.
* YAGO has been evaluated manually, thus attaching a probabilistic precision value to each of its relations. For example, for the relation "actedInMovie", we know that 97% of the statements are correct. Details are here: http://www.mpi-inf.mpg.de/yago-naga/yago/statistics.html For DBpedia, there is no such analysis to our knowledge.
* The two ontologies have a surprisingly small overlap of data and instances if mapped naively, see Chapter 5.2.3 in http://suchanek.name/work/publications/iswc2011.pdf ... but a larger overlap if mapped in a more sophisticated way, see Section 6.4 in http://suchanek.name/work/publications/vldb2012.pdf
* There are certainly a number of further differences that I may not know of or may have overlooked; please feel free to add.
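As a rough illustration of the first point in the list (raw infobox attributes versus a curated set of relationships), here is a small Python sketch. It is not the extraction code of either project; the infobox contents, the attribute names and the mapping table are all assumptions invented for the example.

```python
# Hypothetical infobox of one article; attribute names are invented for illustration.
infobox = {
    "birth_place": "Ulm",
    "birthplace": "Ulm",        # same meaning, different spelling
    "signature_size": "150px",  # layout detail, not really a fact
}


def extract_raw(entity, infobox):
    """Raw style: every infobox attribute becomes its own relationship."""
    return [(entity, attr, value) for attr, value in infobox.items()]


# Assumed hand-written mapping from attributes to a small, fixed set of relationships.
ATTRIBUTE_TO_RELATION = {
    "birth_place": "wasBornIn",
    "birthplace": "wasBornIn",
}


def extract_mapped(entity, infobox):
    """Mapped style: keep only attributes that map to a known relationship."""
    facts = set()
    for attr, value in infobox.items():
        relation = ATTRIBUTE_TO_RELATION.get(attr)
        if relation is not None:
            facts.add((entity, relation, value))
    return sorted(facts)
```

In this toy setting, `extract_raw` keeps all three attributes, including the duplicate and the layout detail (exhaustive but noisy), while `extract_mapped` yields a single "wasBornIn" fact.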
what is the www conference ?
The WWW conference is a scientific conference in computer science on the newest developments of the Web. It is in Lyon this year: http://www2012.wwwconference.org
Cheers
Fabian
-- Fabian online: http://suchanek.name