Dear all,
TL;DR: We are working on an *experimental* Wikidata RDF export based on DBpedia and would like some feedback on our future directions.
Disclaimer: this work is not related to or affiliated with the official Wikidata RDF dumps.
Our current approach is to treat Wikidata like any other Wikipedia edition and apply our extractors to each Wikidata page (item). This approach generates triples in the DBpedia domain (http://wikidata.dbpedia.org/resource/). Although this duplicates data, since Wikidata already provides RDF, we made some different design choices and map Wikidata data directly into the DBpedia ontology.
sample data: http://nl.dbpedia.org/downloads/wikidatawiki/sample/
experimental dump: http://nl.dbpedia.org/downloads/wikidatawiki/20150207/ (for known errors, see the notes below)
*Wikidata mapping details*
In the same way we use mappings.dbpedia.org to define mappings from Wikipedia templates to the DBpedia ontology, we define transformation mappings from Wikidata properties to RDF triples in the DBpedia ontology.
At the moment we provide two types of Wikidata property mappings:
a) through the mappings wiki, in the form of equivalent classes or properties, e.g.
Property: http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate
Class: http://mappings.dbpedia.org/index.php/OntologyClass:Person
which will result in the following triples:
wd:Qx a dbo:Person
wd:Qx dbo:birthDate "..."
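In RDF terms, such mappings-wiki declarations amount to equivalence axioms, roughly like this sketch (the Wikidata entity namespace shown is our assumption):

    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix wde: <http://www.wikidata.org/entity/> .

    # BirthDate mapped as equivalent to P569 (date of birth),
    # Person as equivalent to Q5 (human):
    dbo:birthDate owl:equivalentProperty wde:P569 .
    dbo:Person owl:equivalentClass wde:Q5 .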
b) transformation mappings that are (for now) defined in a json file [1]. At the moment we provide the following mapping options:
- Predefined values, e.g.
"P625": {"rdf:type": "http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing"}
will result in: wd:Qx a geo:SpatialThing
- Value formatting with a string containing $1, e.g.
"P214": {"owl:sameAs": "http://viaf.org/viaf/$1"}
will result in: wd:Qx owl:sameAs http://viaf.org/viaf/{viafID}
- Value formatting with predefined functions. The following are supported for now:
$getDBpediaClass: returns the equivalent DBpedia class for a Wikidata item (using the mappings wiki)
$getLatitude, $getLongitude & $getGeoRss: geo-related functions
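For example, a mapping of our own invention combining $getDBpediaClass with P31 (instance of) would look like this, shown here as a Turtle comment plus the triple it would produce:

    # hypothetical mapping (our illustration): "P31": {"rdf:type": "$getDBpediaClass"}
    @prefix wd:  <http://wikidata.dbpedia.org/resource/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .

    # for an item whose P31 value is Q5 (human), and with Q5 mapped to
    # dbo:Person on the mappings wiki, this would emit:
    wd:Q42 a dbo:Person .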
Also note that we can define multiple mappings per property to get the Wikidata data closer to the DBpedia RDF exports, e.g.:
"P625": [
{"rdf:type":"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing%22%7D,
{"geo:lat":"$getLatitude"},
{"geo:long": "$getLongitude"},
{"georss:point":"$getGeoRss"}],
"P18": [
{"thumbnail":" http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300%22%7D,
{"foaf:depiction":"http://commons.wikimedia.org/wiki/Special:FilePath/$1%22%7D],
*Qualifiers & reification*
Like Wikidata, we provide a simplified dump without qualifiers and a reified dump with qualifiers. However, for the reification we chose simple RDF reification in order to reuse the DBpedia ontology as much as possible. The reified dumps are also mapped using the same configuration.
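A minimal sketch of the two shapes, assuming an invented statement URI and our own guess (dbo:startDate) for a mapped qualifier predicate:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix wd:  <http://wikidata.dbpedia.org/resource/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .

    # simplified dump: the bare statement
    wd:Qx dbo:spouse wd:Qy .

    # reified dump: simple RDF reification, so qualifiers can attach
    wd:Qx_statement_1 a rdf:Statement ;   # statement URI invented for this sketch
        rdf:subject wd:Qx ;
        rdf:predicate dbo:spouse ;
        rdf:object wd:Qy ;
        dbo:startDate "2001-01-01" .      # a qualifier, mapped like any property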
*Labels, descriptions, aliases and interwiki links*
We additionally defined extractors to get data other than statements. For textual data we split the dumps into the languages that are enabled in the mappings wiki and all the rest. We extract aliases, labels, descriptions, and site links. For interwiki links we provide links between Wikidata and DBpedia as well as links between different DBpedia language editions.
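A sketch of the shape of this data, assuming rdfs:label for labels and owl:sameAs for the interwiki links (the predicates actually used for descriptions and aliases may differ):

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix wd:   <http://wikidata.dbpedia.org/resource/> .

    wd:Q64 rdfs:label "Berlin"@en , "Berlino"@it ;
        owl:sameAs <http://dbpedia.org/resource/Berlin> ,      # Wikidata-to-DBpedia
                   <http://it.dbpedia.org/resource/Berlino> .  # between language editions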
*Properties*
We also fully extract Wikidata property pages. However, for now we don't apply any mappings to Wikidata properties.
*DBpedia extractors*
Some existing DBpedia extractors that provide versioning and provenance (e.g. pageID, revisionID, etc.) also apply to Wikidata.
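Elsewhere in DBpedia these extractors emit dbo:wikiPageID and dbo:wikiPageRevisionID, so one can expect triples of this shape (the values here are invented):

    @prefix wd:  <http://wikidata.dbpedia.org/resource/> .
    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    wd:Q64 dbo:wikiPageID "405"^^xsd:integer ;               # value invented
        dbo:wikiPageRevisionID "198710212"^^xsd:integer .    # value invented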
*Help & Feedback*
Although this is work in progress, we wanted to announce it early and get your feedback on the following:
- Are we going in the right direction?
- Did we overlook something or is something missing?
- Are there any other mapping options we should include?
- Where should we host the advanced json mappings? One option is the mappings wiki; another is Wikidata directly, or a separate github project.
It would be great if you could help us map more data. The easiest way is through the mappings wiki, where you can define equivalent classes & properties. See what is missing here: http://mappings.dbpedia.org/server/ontology/wikidata/missing/
You can also provide json configuration, but until the code is merged this will not be easy to do via PRs.
Until the code is merged in the main DBpedia repo you can check it out from here:
https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits
Notes:
- we use the Wikidata-Toolkit for reading the json structure, which is a great project btw
- the full dump we provide is not complete due to a Wikidata dump export bug; the compressed files are not closed correctly because of this.
Best,
Ali Ismayilov, Dimitris Kontokostas, Sören Auer
[1] https://github.com/alismayilov/extraction-framework/blob/wikidataAllCommits/...
Dimitris, Soren, and DBpedia team,
That sounds like an interesting project, but I got lost between the statement of intent, below, and the practical consequences:
On Tue, Mar 10, 2015 at 5:05 PM, Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de> wrote:
we made some different design choices and map wikidata data directly into the DBpedia ontology.
What, from your point of view, is the practical consequence of these different design choices? How do the end results manifest themselves to the consumers?
Tom
Dear Tom,
let me try to answer this question in a more general way. In the future, we are honestly considering mapping all data on the web to the DBpedia ontology (extending it where it makes sense). We hope that this will enable you to query many data sets on the Web using the same queries.
As a convenience measure, we will get a huge download server that provides all data from a single point, in consistent formats and with consistent metadata, classified by the DBpedia Ontology. Wikidata is just one example; there are also Commons, Wiktionary (hopefully via DBnary), data from companies, DBpedia members and EU projects.
all the best, Sebastian
I think it is a very good idea to express facts in a "standard" vocabulary, so I am excited to see Wikidata available with the DBpedia vocabulary -- this is a quick way for people who like DBpedia to get better data.
There is the minus that there will always be some loss in conversion between vocabularies, but I think that will be more than made up for by having a data set that many people will be able to use right out of the gate.
Kudos!
This is a very ambitious, but commendable, goal. Mapping all data on the web to the DBpedia ontology is a huge undertaking that will take many years of effort. However, if it can be accomplished, the potential payoff is also huge and could result in the realization of a true Semantic Web.
Just as with any very large and complex software development effort, there needs to be a structured approach to achieving the desired results. That structured approach probably involves a clear requirements analysis and resulting requirements documentation. It also requires a design document and an implementation document, as well as risk assessment and risk mitigation. While there is no bigger believer than I in the "build a little, test a little" rapid prototyping approach to development, I don't think it is appropriate for a project of this size and complexity. The size and complexity also suggest the final product will likely be beyond the scope of any individual to fully comprehend the overall ontological structure.
Therefore, a reasonable approach might be to break the effort into smaller, comprehensible segments. Since this is a large ontology development effort, segmenting the ontology into domains of interest and creating working groups to focus on each domain might be a workable approach. There would also need to be a working group that focuses on the top levels of the ontology and monitors the domain working groups, to ensure overall compatibility, reduce the likelihood of duplicate or overlapping concepts in the upper levels of the ontology, and treat universal concepts such as space and time consistently. There also needs to be a clear, and hopefully simple, approach to mapping data on the web to the DBpedia ontology that will accommodate both large data developers and web site developers.
It would be wonderful to see the worldwide web community get behind such an initiative and make rapid progress in realizing this commendable goal. However, just as special interests defeated the goal of having a universal software development approach (Ada), I fear the same sorts of special interests will likely result in a continuation of the current myriad development efforts. I understand the "one size doesn't fit all" argument, but I also think "one size could fit a whole lot" could be the case here.
Respectfully,
John Flynn http://semanticsimulations.com
Your description sounds quite close to what we had in mind. The high-level group is manifesting quite well; the domain groups are planned as pilots for selected domains (e.g. Law or Mobility).
I have lost the overview a bit on the data classification. We might auto-link or crowdsource; I would need to ask others, however.
We are aiming to create a structure that allows stability and innovation in an economic way -- I see this as the real challenge...
Jolly good show, Sebastian
Sebastian,
Thanks very much for the explanation. It was a single missing word, "ontology," which led me astray. If the opening sentence had said "based on the DBpedia ontology," I probably would have figured it out. Your amplification of the underlying motivation helps me better understand what's driving this though.
I guess I had naively abandoned critical thinking and assumed DBpedia was dead now that we had Wikidata, without thinking about how the two could evolve / compete / cooperate / thrive.
Good luck!
Best regards, Tom
Dimitris Kontokostas> we made some different design choices and map wikidata data directly into the DBpedia ontology.
I’m very interested in this. A simple example: bgwiki started keeping its Place Hierarchy in Wikidata, because it’s much less efficient to keep it in deeply nested subtemplates. This made it very hard for bgdbpedia to extract this info, because how do you mix e.g. dbo:partOf and wd:Pnnn? So this is a logical continuation of the first step, which was for DBpedia to source inter-language links (owl:sameAs) from WD. (I haven’t tracked the list in a while; could someone give me a link to such a dump? Sorry)
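(To make the mixing problem concrete, here is a sketch with placeholder identifiers: infobox-extracted data speaks dbo:, raw Wikidata statements speak wd:Pnnn, and the mapped dump would bring the latter into the dbo: vocabulary too.)

    @prefix dbo: <http://dbpedia.org/ontology/> .
    @prefix wd:  <http://wikidata.dbpedia.org/resource/> .

    # extracted from bgwiki infobox templates (placeholder resources):
    <http://bg.dbpedia.org/resource/SomeTown> dbo:isPartOf <http://bg.dbpedia.org/resource/SomeProvince> .

    # the same relation kept only in Wikidata surfaces as a raw property
    # (P131, "located in"); the mapped dump would bring it into dbo: as well:
    wd:Qx dbo:isPartOf wd:Qy .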
Tom Morris> abandoned critical thinking and assumed DBpedia was dead now that we had WikiData
That’s quite false. Both have their strengths and weaknesses.
- DBpedia has much more info than Wikidata. For Chrissake, Wikidata doesn’t even have category>article assignments!
- Wikidata has more entities (true, "stubs"), a lot of them created for coreferencing (authority control) purposes. IMHO there’s a bit of a revolution in this domain; check out https://www.wikidata.org/wiki/Wikidata:WikiProject_Authority_control https://twitter.com/hashtag/coreferencing https://tools.wmflabs.org/mix-n-match/ VIAF is moving to Wikidata coreferencing, which will get them double the name forms, 300k orgs and 700k persons. This is a Big Deal to any library or museum hack.
- Wikidata has easier access to labels. In DBpedia you have to do a wikiPageRedirects dance, and if you’re naïve you’ll assume “God Does Not Play Dice” is another name for Einstein (see the sketch after this list)
- Right now IMHO Wikidata has better direct types for persons. This is a shame and we need to fix it in DBpedia
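(A sketch of the redirects dance from the list above, using the real DBpedia predicates involved:)

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix dbo:  <http://dbpedia.org/ontology/> .
    @prefix dbr:  <http://dbpedia.org/resource/> .

    # the redirect page carries its own label...
    dbr:God_Does_Not_Play_Dice rdfs:label "God Does Not Play Dice"@en ;
        dbo:wikiPageRedirects dbr:Albert_Einstein .

    # ...so a naive label harvest mistakes it for a name of the target;
    # you must follow dbo:wikiPageRedirects to reach the real entity:
    dbr:Albert_Einstein rdfs:label "Albert Einstein"@en .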
Tom Morris> without thinking about how the two could evolve / compete / cooperate / thrive.
That’s exactly what we have to think about. For the last few months I've been worrying that the two communities don't talk to each other much. We as humanity should leverage the strengths of both, to gain maximum benefits.
I've become active in both communities, and I feel no shame in such split loyalties. :-) I went to DBpedia Dublin, now I'll go to GlamWiki The Hague... It's structured data each way!
Some little incoherencies:
- The DBpedia Extraction framework is a very ingenious thing, and the devs working on it are very smart. But can they compete with a thousand Wikidata bot writers? (plus Magnus who in my mind holds semi-God status)
- Meanwhile, DBpedia can't muster a willing Wikipedia hack to service mappings.dbpedia.org, which is stuck in the stone age.
- Wikidatians move around tons of data every day, but their understanding of RDF *as a community* is still a bit naïve.
- DBpedia holds tons of structured data, but Wikidata seems to plan to source it by individual bot contributions... maybe in full in 5 years' time?
- DBpedia has grokked the black magic of dealing with hand-written *multilingual* units and conversions. Of a gazillion units: http://mappings.dbpedia.org/index.php/DBpedia_Datatypes Last I looked, Wikidata folks shrugged this off with "too different from our data types".
- https://tools.wmflabs.org/mix-n-match/ is a crowdsourcing Wonder upon God's Earth, but nary a DBpedian has heard of it IMHO.
A Little Cooperation Goes a Long Way.
Cheers!
On 3/30/15 3:39 PM, Vladimir Alexiev wrote:
Tom Morris> without thinking about how the two could evolve / compete / cooperate / thrive.
That’s exactly what we have to think about. For the last few months I've been worrying that the two communities don't talk to each other much.
Hmm..
We just have to keep on working towards common ground. These two projects are ridiculously complementary! The only challenge is making that claim utterly obvious, so that we reduce "mutual exclusivity"-oriented distractions.
We as humanity should leverage the strengths of both, to gain maximum benefits.
Amen! Anything less is an utter shame.
I've become active in both communities, and I feel no shame in such split loyalties. :-)
I don't look at it as split loyalties. I see a fundamental obligation to put as much effort as possible into getting everyone to see "mutual inclusion" where they currently see "mutual exclusion". I could have a good old rant about what's been happening here, but it would be misconstrued (on a good day) and only serve to add more distraction to the goals that many of us have in mind re. collaboration across all of these structured data projects.
I went to DBpedia Dublin, now I'll go to GlamWiki The Hague... It's structured data each way!
Yes!
Some little incoherencies:
- The DBpedia Extraction framework is a very ingenious thing, and the devs working on it are very smart. But can they compete with a thousand Wikidata bot writers? (plus Magnus who in my mind holds semi-God status)
Yes, but we don't need to look at it that way :) Be a reluctant warrior.
- Meanwhile, DBpedia can't muster a willing Wikipedia hack to service mappings.dbpedia.org, which is stuck in the stone age.
- Wikidatians move around tons of data every day, but their understanding of RDF *as a community* is still a bit naïve.
Yes, so we MUST teach teach teach, at every opportunity. But in a constructive rather than condescending way.
- DBpedia holds tons of structured data, but Wikidata seems to plan to source it by individual bot contributions... maybe in full in 5 years time?
- DBpedia has grokked the black magic of dealing with hand-written *multilingual* units and conversions. Of a gazillion units http://mappings.dbpedia.org/index.php/DBpedia_Datatypes Last I looked, Wikidata folks shrugged this off with "too different from our data types"
- https://tools.wmflabs.org/mix-n-match/ is a crowdsourcing Wonder upon God's Earth, but nary a DBpedian has heard of it IMHO
A Little Cooperation Goes a Long Way.
Amen, one more time!
Kingsley