Would the WikiData community be interested in linking together Wikidata pages with Freebase entities? I've proposed a new property to link the two datasets here:
http://www.wikidata.org/wiki/Wikidata:Property_proposal/all#Freebase_identif...
Freebase already has interwiki links for many entities so it wouldn't be too hard to automatically determine the corresponding Wikidata pages. This would allow people to mash up both datasets and cross-reference facts more easily.
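A minimal sketch of that matching step, assuming both datasets can be exported as (identifier, wiki, page title) triples. The function names, the tuple layout, and every identifier in the sample data are made up for illustration; neither project's real dump format or API is shown here. Topics that resolve to more than one Wikidata item are set aside for review rather than linked automatically.

```python
from collections import defaultdict


def normalize(title):
    """Treat underscores as spaces and capitalise the first letter."""
    cleaned = title.replace("_", " ").strip()
    return cleaned[:1].upper() + cleaned[1:]


def link_by_sitelink(freebase_links, wikidata_sitelinks):
    """Pair Freebase topics with Wikidata items that share a Wikipedia page.

    Both arguments are iterables of (identifier, wiki, page_title).
    Returns (matches, ambiguous): matches maps a Freebase id to one
    Wikidata id, ambiguous maps it to a sorted list of several.
    """
    # Index Wikidata items by the (wiki, title) page they link to.
    by_page = defaultdict(set)
    for qid, wiki, title in wikidata_sitelinks:
        by_page[(wiki, normalize(title))].add(qid)

    # Collect every Wikidata item reachable from each Freebase topic.
    candidates = defaultdict(set)
    for mid, wiki, title in freebase_links:
        candidates[mid] |= by_page.get((wiki, normalize(title)), set())

    matches, ambiguous = {}, {}
    for mid, qids in candidates.items():
        if len(qids) == 1:
            matches[mid] = next(iter(qids))
        elif len(qids) > 1:
            ambiguous[mid] = sorted(qids)
    return matches, ambiguous


# Hypothetical sample data; none of these identifiers are real.
freebase_links = [
    ("/m/example_a", "enwiki", "Douglas_Adams"),
    ("/m/example_a", "dewiki", "Douglas Adams"),
    ("/m/example_b", "enwiki", "Some page"),
    ("/m/example_b", "frwiki", "Une page"),
    ("/m/example_c", "enwiki", "Unmatched page"),
]
wikidata_sitelinks = [
    ("Q_A", "enwiki", "Douglas Adams"),
    ("Q_A", "dewiki", "Douglas Adams"),
    ("Q_B1", "enwiki", "Some page"),
    ("Q_B2", "frwiki", "Une page"),
]

matches, ambiguous = link_by_sitelink(freebase_links, wikidata_sitelinks)
print(matches)
print(ambiguous)
```

Using every language edition's link as evidence, rather than only one wiki, is what lets the sketch notice topics whose interwiki links disagree with each other.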
I commented on the property proposal thread to this effect, so you can answer there if you wish. I am generally supportive of creating external identifier links (in fact I've imported .98M external IDs). Still I would like for you to explain Google's motivations for this project, to make sure it's simpatico.
Maximilian Klein Wikipedian in Residence, OCLC +17074787023
From: wikidata-l-bounces@lists.wikimedia.org on behalf of Shawn Simister
Sent: Friday, June 14, 2013 11:53 AM
To: Discussion list for the Wikidata project.
Subject: [Wikidata-l] Linking WikiData to Freebase
Would the WikiData community be interested in linking together Wikidata pages with Freebase entities? I've proposed a new property to link the two datasets here:
http://www.wikidata.org/wiki/Wikidata:Property_proposal/all#Freebase_identif...
Freebase already has interwiki links for many entities so it wouldn't be too hard to automatically determine the corresponding Wikidata pages. This would allow people to mash up both datasets and cross-reference facts more easily.
-- Shawn Simister
Knowledge Developer Relations Google
There are several ways that this would benefit both Google and Wikidata. First, we currently extract a lot of data from WP infoboxes and load that data into Freebase which eventually makes its way into the Knowledge Graph so linking the two datasets would make it easier for us to extract similar data from WikiData in the future. Many other tech companies and researchers are doing similar extraction projects from WP and Freebase so this would benefit them as well.
Secondly, we'd like to contribute (or enable the Wikidata community to contribute) the data that we've already extracted from WP infoboxes back to WikiData. I'm not quite sure what the best way to do that is (pull by the community is probably better than push by Google) but having the linkages between equivalent concepts is an important first step to sharing more data.
Lastly, the Freebase community does a lot of work to clean up data that was imported from WP, OpenLibrary, MusicBrainz, etc. including merging duplicate topics and splitting apart conflated topics. This is important but tedious work that often doesn't get pushed back to the original sources for no other reason than there simply isn't a well-defined process for how that should work. I'm hopeful that the WikiData community will find a way to benefit from the cleanup that we do in Freebase creating a virtuous cycle that improves the quality of both datasets.
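One way such a process could start is by replaying topic merges onto a table of Freebase-to-Wikidata links. The sketch below assumes a merge log shaped as a mapping from each merged-away topic to the topic it was merged into; that shape, the function names, and all identifiers are hypothetical and not taken from Freebase itself. When two merged topics pointed at different Wikidata items, the pair is reported as a conflict, which may indicate a duplicate worth reviewing on the Wikidata side.

```python
def resolve_merges(merged_into, topic_id):
    """Follow merge redirects until reaching a topic that was not merged away."""
    seen = {topic_id}
    while topic_id in merged_into:
        topic_id = merged_into[topic_id]
        if topic_id in seen:
            raise ValueError(f"merge cycle at {topic_id}")
        seen.add(topic_id)
    return topic_id


def apply_merges(links, merged_into):
    """Re-key a {topic_id: item_id} table onto surviving topics.

    Returns (updated, conflicts). conflicts maps a surviving topic to the
    set of differing items its merged sources pointed at.
    """
    updated, conflicts = {}, {}
    for topic, item in links.items():
        survivor = resolve_merges(merged_into, topic)
        if survivor in conflicts:
            conflicts[survivor].add(item)
        elif survivor in updated and updated[survivor] != item:
            # Two merged topics disagree: pull the link out for review.
            conflicts[survivor] = {updated.pop(survivor), item}
        else:
            updated[survivor] = item
    return updated, conflicts


# Hypothetical merge log and link table; none of these identifiers are real.
merged_into = {
    "/m/dup1": "/m/dup2",
    "/m/dup2": "/m/main",
    "/m/x_old": "/m/x",
}
links = {
    "/m/dup1": "Q_A",
    "/m/main": "Q_A",
    "/m/x_old": "Q_B",
    "/m/x": "Q_C",
    "/m/solo": "Q_D",
}

updated, conflicts = apply_merges(links, merged_into)
print(updated)
print(conflicts)
```

Splits of conflated topics are not covered here; they need a human decision about which half keeps the existing link.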
On Fri, Jun 14, 2013 at 12:19 PM, Klein,Max kleinm@oclc.org wrote:
I commented on the property proposal thread to this effect, so you can answer there if you wish. I am generally supportive of creating external identifier links (in fact I've imported .98M external IDs). Still I would like for you to explain Google's motivations for this project, to make sure it's simpatico.
Maximilian Klein Wikipedian in Residence, OCLC +17074787023
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Shawn Simister, 14/06/2013 23:18:
There are several ways that this would benefit both Google and Wikidata. First, we currently extract a lot of data from WP infoboxes and load that data into Freebase which eventually makes its way into the Knowledge Graph so linking the two datasets would make it easier for us to extract similar data from WikiData in the future. Many other tech companies and researchers are doing similar extraction projects from WP and Freebase so this would benefit them as well.
Secondly, we'd like to contribute (or enable the Wikidata community to contribute) the data that we've already extracted from WP infoboxes back to WikiData. I'm not quite sure what the best way to do that is (pull by the community is probably better than push by Google) but having the linkages between equivalent concepts is an important first step to sharing more data.
I think this makes a lot of sense. I also see no problem with Google, or other entities that already extract data from Wikimedia projects, directly operating bots that push to Wikidata. It's actually something very good, provided that 1) community processes are followed to define and create properties, 2) other policies are followed, of course, and 3) to allow inspection of 1-2 and more, the logic (and if possible the code) of the extraction and injection is transparently described on the wiki.
Lastly, the Freebase community does a lot of work to clean up data that was imported from WP, OpenLibrary, MusicBrainz, etc. including merging duplicate topics and splitting apart conflated topics. This is important but tedious work that often doesn't get pushed back to the original sources for no other reason than there simply isn't a well-defined process for how that should work. I'm hopeful that the WikiData community will find a way to benefit from the cleanup that we do in Freebase creating a virtuous cycle that improves the quality of both datasets.
Yes, this is a case where collaboration should be particularly easy and productive. For a closer integration with MusicBrainz, they'd probably have to change their (non)licenses. https://en.wikipedia.org/wiki/MusicBrainz#Licensing (They seem to claim they're free content, but they aren't.)
Nemo
On Sat, Jun 15, 2013 at 3:52 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
For a closer integration with MusicBrainz, they'd probably have to change their (non)licenses. https://en.wikipedia.org/wiki/MusicBrainz#Licensing (They seem to claim they're free content, but they aren't.)
The core MusicBrainz facts (artists, albums, tracks) are in the public domain. It's the extra stuff like user-generated comments, ratings, etc. that they license.
Tom
Tom Morris, 15/06/2013 14:11:
On Sat, Jun 15, 2013 at 3:52 AM, Federico Leva (Nemo) wrote:
For a closer integration with MusicBrainz, they'd probably have to change their (non)licenses. https://en.wikipedia.org/wiki/MusicBrainz#Licensing (They seem to claim they're free content, but they aren't.)
The core MusicBrainz facts (artists, albums, tracks) are in the public domain. It's the extra stuff like user generated comments, ratings, etc that they license.
Yes, but their licensing scheme is not very clear and PD is not a license.
Nemo
One can't license something for which one does/can not have the copyright.
Barry
Yes that was one of the issues raised in this paper:
"There is no money in Linked Data" Prateek Jain, Pascal Hitzler, Krzysztof Janowicz, Chitra Venkatramani => http://knoesis.wright.edu/pascal/pub/nomoneylod.pdf
-Nicolas.
On 18/06/2013 09:57, "Federico Leva (Nemo)" nemowiki@gmail.com wrote:
Barry Norton, 18/06/2013 00:08:
One can't license something for which one does/can not have the copyright.
Are you saying they have no way to use CC-0? For past or even future content? If both, can't they use PDM? If not, how can they say it's PD?
MusicBrainz changed the license of their core data (everything that matters) to CC0 a little while ago: https://musicbrainz.org/doc/About/Data_License
nick.
----------------------------- http://www.bbc.co.uk This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. -----------------------------
Nicholas Humfrey, 18/06/2013 12:57:
MusicBrainz changed the license of their core data (everything that matters) to CC0 a little while ago: https://musicbrainz.org/doc/About/Data_License
Nice, I see there's even a definition of "The core data of the database". :) It's still unclear to me why they use an unfree license for the rest.
Nemo
Nick, thanks for pointing that out - this had completely slipped out of my mind, including in a conversation with Denny about Wikidata.
I believe the reason for the 'non-commercial' open license where MusicBrainz add value is that they need to make revenue to support their activities... including from the good folks at the BBC.
Barry
Yes, that is right. I believe the reason for some of the data being under a less liberal license is to taint the live data feed that contains both the core CC0 data and the Supplementary data.
MusicBrainz customers (such as the BBC) then pay MusicBrainz to get data updates every hour, which can be used for commercial purposes.
nick.
On Fri, Jun 14, 2013 at 5:18 PM, Shawn Simister simister@google.com wrote:
Secondly, we'd like to contribute (or enable the Wikidata community to contribute) the data that we've already extracted from WP infoboxes back to WikiData. I'm not quite sure what the best way to do that is (pull by the community is probably better than push by Google) but having the linkages between equivalent concepts is an important first step to sharing more data.
Hi Shawn, I have a couple of questions: 1) How do you store the provenance of the data? Which method do you use for sourcing your statements? 2) If the licenses for the data are mixed, how is it possible to know which data is CC0 and which CC-BY?
Cheers, Micru
I hope to cooperate