Hello all,
The Photographers' Identities Catalog (PIC) is an ongoing project of visualizing photo history through the lives of photographers and photo studios. I have information on 115,000 photographers and studios as of tonight. It is still under construction, but as I've almost completed an initial indexing of the ~12,000 photographers in Wikidata, I thought I'd share it with you. We (the New York Public Library) hope to launch it officially in mid-to-late January. This represents about 12 years' worth of my work researching in NYPL's photography collection, censuses, and business directories, and scraping or indexing trusted websites, databases, and published biographical dictionaries pertaining to photo history.
Again, please bear in mind that our programmer is still hard at work (and I continue to refine and add to the data*), but we welcome your feedback, questions, critiques, etc. To see the Wikidata photographers, select WikiData from the Source dropdown. Have fun!
Thanks, David
*Tomorrow, for instance, I'll start mining Wikidata for birth & death locations.
Can you explain what "indexing" means in this context? Is there some type of matching process? How are duplicates resolved, if at all? Was the Wikidata info extracted from a dump or one of the APIs?
When I looked at the first person I picked at random, Pierre Berdoy (ID:269710), I see that both Wikidata and Wikipedia claim that he was born in Biarritz while the NYPL database claims he was born in Nashua, NH. So, it would appear that there are either two different people with the same name, born in different places, or the birth place is wrong.
http://mgiraldo.github.io/pic/?&biography.TermID=2028247&Location=26... https://www.wikidata.org/wiki/Q3383941
Tom
On Tue, Dec 8, 2015 at 7:10 PM, David Lowe davidlowe@nypl.org wrote:
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Thanks, Tom. I'll have to look at this specific case when I'm back at work tomorrow, as it does seem you found an error.
As for my process: with WD, I queried out the label, description, country of citizenship, date of birth & date of death of everyone with occupation: photographer. After some cleaning, I can get the WD data formatted like my own (Name, Nationality, Dates). I can then do a simple match, where everything matches exactly. For the remainder, I then match names and dates, without Nationality, which is often very "soft" information. Those that pass a smell test (one is "English", the other is "British") I pass along, too. Those with greater discrepancies I look at more closely. For those with still greater discrepancies, I manually, individually query my database for anyone with the same last name & same first initial to catch misspellings or different transliterations. I also occasionally put my entire database into OpenRefine to catch instances where, for instance, a Chinese name has been given as FamilyName, GivenName in one source, and GivenName, FamilyName in another.
In short, this is scrupulously, and manually, checked data. I'm not savvy enough to let an algorithm make my mistakes for me! But let me know if the conflicting data you found seems to be more than bad luck of the draw.
I should also say that I may suppress the Niepce Museum collection, as it comes from a really crappy list of photographers in their collection which I found many years ago and can no longer find. I don't want to blame them for the discrepancy, but that might be the source. I don't know. As I start to query out places of birth & death from WD in the next few days, I expect to find more discrepancies. (Just today, I found dozens of folks whom ULAN gendered one way and WD another, but who were undeniably the same photographer.)
Thanks, David
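The matching tiers described above could be sketched roughly as follows. This is only an illustration, not the process actually used for PIC: the `Person` record, the `classify` function, the tier labels, and the set of "soft" nationality pairs are all hypothetical, and the sketch assumes year-only dates and names written in "Given Family" order for the last-name/first-initial check.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record mirroring the (Name, Nationality, Dates) shape.
@dataclass(frozen=True)
class Person:
    name: str
    nationality: str
    born: Optional[int]  # year only, an assumption of this sketch
    died: Optional[int]


# Assumed, illustrative pairs of nationality labels treated as near-equivalent.
SOFT_NATIONALITY = {frozenset({"english", "british"})}


def norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def name_tokens(name: str) -> frozenset:
    """Order-insensitive key: 'Family, Given' and 'Given Family' yield the same tokens."""
    return frozenset(norm(name.replace(",", " ")).split())


def surname_initial(name: str) -> tuple:
    """Assumes 'Given Family' order and returns (family name, first initial)."""
    parts = norm(name.replace(",", " ")).split()
    return (parts[-1], parts[0][0]) if parts else ("", "")


def classify(a: Person, b: Person) -> str:
    """Return the tier a pair of records falls into, from strictest to loosest."""
    same_dates = (a.born, a.died) == (b.born, b.died)
    same_name = norm(a.name) == norm(b.name)
    nat_a, nat_b = norm(a.nationality), norm(b.nationality)

    if same_name and same_dates:
        if nat_a == nat_b:
            return "exact"             # everything matches
        if frozenset({nat_a, nat_b}) in SOFT_NATIONALITY:
            return "soft-nationality"  # passes the "smell test"
        return "review"                # nationality conflict, look closer
    if same_dates and name_tokens(a.name) == name_tokens(b.name):
        return "name-order"            # same name parts in a different order
    if surname_initial(a.name) == surname_initial(b.name):
        return "candidate"             # for manual checking only
    return "no-match"


if __name__ == "__main__":
    # Made-up records, purely for illustration.
    mine = Person("Jane Example", "English", 1850, 1910)
    theirs = Person("Jane Example", "British", 1850, 1910)
    print(classify(mine, theirs))
```

Only the first two tiers would be safe to accept automatically; everything labelled "review", "name-order", or "candidate" is meant to be handed to a person, in keeping with the manual checking described above.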
On Tuesday, December 8, 2015, Tom Morris tfmorris@gmail.com wrote:
In case you haven't come across it before http://kulturnav.org/1f368832-7649-4386-97b6-ae40cce8752b is the entry point to the Swedish database of (primarily early) photographers curated by the Nordic Museum in Stockholm.
It's not that well integrated into Wikidata yet but the plan is to fix that during early 2016. That would also allow a variety of photographs on Wikimedia Commons to be linked to these entries.
Cheers, André
André Costa | GLAM developer, Wikimedia Sverige | Andre.Costa@wikimedia.se | +46 (0)733-964574
Support free knowledge, become a member of Wikimedia Sverige. Read more at blimedlem.wikimedia.se
On 9 December 2015 at 02:44, David Lowe davidlowe@nypl.org wrote:
Thanks, André! I don't know that I've found that before. Great to get country (or region) specific lists like this. D
On Wednesday, December 9, 2015, André Costa andre.costa@wikimedia.se wrote:
Happy to be of use. There is also one for:
- Swedish photo studios [1]
- Norwegian photographers [2]
- Norwegian photo studios [3]
I'm less familiar with these though and don't have a timeline for Wikidata integration.
Cheers, André
[1] http://kulturnav.org/deb494a0-5457-4e5f-ae9b-e1826e0de681
[2] http://kulturnav.org/508197af-6e36-4e4f-927c-79f8f63654b2
[3] http://kulturnav.org/7d2a01d1-724c-4ad2-a18c-e799880a0241
André Costa, GLAM developer, Wikimedia Sverige
On 9 Dec 2015 15:07, "David Lowe" davidlowe@nypl.org wrote:
I think the Norwegian lists are a subset of Preus Photo Museum's list. It is now maintained partly by Nasjonalbiblioteket (the Norwegian one, not the Swedish one) and Norsk Lokalhistorisk Institutt. For example: Anders Beer Wilse in nowiki,[1] at Lokalhistoriewiki,[2] and at Nasjonalbiblioteket.[3]
Kulturnav is a kind of maintained ontology, where most of the work is done by local museums. The software for the site itself was funded (in part) by a grant from Norsk Kulturråd.
We should connect as much as possible of our resources to resources at Kulturnav, and not just copy data. That said, we don't have a very good model for how to materialize data from external sites and make it available for our client sites, so our option is more or less just to copy. It is better to maintain data at one location.
[1] https://no.wikipedia.org/wiki/Anders_Beer_Wilse
[2] https://lokalhistoriewiki.no/index.php/Anders_Beer_Wilse
[3] http://www.nb.no/nmff/fotograf.php?fotograf_id=3050
On Wed, Dec 9, 2015 at 9:51 PM, André Costa andre.costa@wikimedia.se wrote:
Forgot to mention: Anders Beer Wilse is also in Kulturnav, at http://kulturnav.org/2b94216b-f2fc-46a3-b2ce-eeb93aa19185
On Wed, Dec 9, 2015 at 11:19 PM, John Erling Blad jeblad@gmail.com wrote:
I think the Norwegian lists are a subset of Preus Photo Museums list. It is now maintained partly by Nasjonalbiblioteket (the Norwegian one, not the Swedish one) and Norsk Lokalhistorisk Institutt. For examle; Anders Beer Wilse in nowiki,[1] at Lokalhistoriewiki,[2] and at Nasjonalbiblioteket.[3]
Kulturnav is a kind of maintained ontology, where most of the work is done by local museums. The software for the site itself is made (in part) by a grant from Norsk Kulturråd.
We should connect as much as possible of our resources to resources at Kulturnav, and not just copy data. That said, we don't have a very good model for hov to materialize data from external sites and make it available for our client sites, so our option is more or less just to copy. It is better to maintain data at one location.
[1] https://no.wikipedia.org/wiki/Anders_Beer_Wilse [2] https://lokalhistoriewiki.no/index.php/Anders_Beer_Wilse [3] http://www.nb.no/nmff/fotograf.php?fotograf_id=3050
On Wed, Dec 9, 2015 at 9:51 PM, André Costa andre.costa@wikimedia.se wrote:
Happy to be of use. There is also one for:
- Swedish photo studios [1]
- Norwegian photographers[2]
- Norwegian photo studios [3]
I'm less familiar with these though and don't have a timeline for wikidata integration.
Cheers, André
[1] http://kulturnav.org/deb494a0-5457-4e5f-ae9b-e1826e0de681 [2] http://kulturnav.org/508197af-6e36-4e4f-927c-79f8f63654b2 [3] http://kulturnav.org/7d2a01d1-724c-4ad2-a18c-e799880a0241
André Costa GLAM developer Wikimedia Sverige On 9 Dec 2015 15:07, "David Lowe" davidlowe@nypl.org wrote:
Thanks, André! I don't know that I've found that before. Great to get country (or region) specific lists like this. D
On Wednesday, December 9, 2015, André Costa andre.costa@wikimedia.se wrote:
In case you haven't come across it before http://kulturnav.org/1f368832-7649-4386-97b6-ae40cce8752b is the entry point to the Swedish database of (primarily early) photographers curated by the Nordic Museum in Stockholm.
It's not that well integrated into Wikidata yet but the plan is to fix that during early 2016. That would also allow a variety of photographs on Wikimedia Commons to be linked to these entries.
Cheers, André
André Costa | GLAM developer, Wikimedia Sverige | Andre.Costa@wikimedia.se | +46 (0)733-964574
Stöd fri kunskap, bli medlem i Wikimedia Sverige. Läs mer på blimedlem.wikimedia.se
On 9 December 2015 at 02:44, David Lowe davidlowe@nypl.org wrote:
Thanks, Tom. I'll have to look at this specific case when I'm back at work tomorrow, as it does seem you found something in error. As for my process: with WD, I queried out the label, description & country of citizenship, dob & dod of of everyone with occupation: photographer. After some cleaning, I can get the WD data formatted like my own (Name, Nationality, Dates). I can then do a simple match, where everything matches exactly. For the remainder, I then match names and dates- without Nationality, which is often very "soft" information. For those that pass a smell test (one is "English" the other is "British") I pass those along, too. For those with greater discrepancies, I look still closer. For those with still greater discrepancies, I manually, individually query my database for anyone with the same last name & same first initial to catch misspellings or different transliterations. I also occasionally put my entire database into open refine to catch instances where, for instance, a Chinese name has been given as FamilyName, GivenName in one source, and GivenName, FamilyName in another. In short, this is scrupulously- and manually- checked data. I'm not savvy enough to let an algorithm make my mistakes for me! But let me know if this seems to be more than bad luck of the draw- finding the conflicting data you found. I have also to say, I may suppress the Niepce Museum collection, as it's from a really crappy list of photographers in their collection which I found many years ago, and can no longer find. I don't want to blame them for the discrepancy, but that might be the source. I don't know. As I start to query out places of birth & death from WD in the next days, I expect to find more discrepancies. (Just today, I found dozens of folks whom ULAN gendered one way, and WD another- but were undeniably the same photographer. ) Thanks, David
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Yep, thanks! Wilse was there in duplicate (here's the correct one: http://mgiraldo.github.io/pic/?address.AddressTypeID=*&address.CountryID=*&Nationality=*&gender.TermID=*&process.TermID=*&role.TermID=*&format.TermID=*&biography.TermID=*&collection.TermID=*&Location=*&DisplayName=(Anders~1%20AND%20Beer~1%20AND%20Wilse~1)&Date=* ). The other will be gone in an hour or so when I update. I look forward to looking at these lists, thanks! It will probably be next week before I finish ingesting birth & death locations from WD.
d
On Wed, Dec 9, 2015 at 5:27 PM, John Erling Blad jeblad@gmail.com wrote:
Forgot to mention; Anders Beer Wilse in Kulturnav http://kulturnav.org/2b94216b-f2fc-46a3-b2ce-eeb93aa19185
On Wed, Dec 9, 2015 at 11:19 PM, John Erling Blad jeblad@gmail.com wrote:
I think the Norwegian lists are a subset of Preus Photo Museum's list. It is now maintained partly by Nasjonalbiblioteket (the Norwegian one, not the Swedish one) and Norsk Lokalhistorisk Institutt. For example: Anders Beer Wilse in nowiki,[1] at Lokalhistoriewiki,[2] and at Nasjonalbiblioteket.[3]
Kulturnav is a kind of maintained ontology, where most of the work is done by local museums. The software for the site itself is made (in part) by a grant from Norsk Kulturråd.
We should connect as much as possible of our resources to resources at Kulturnav, and not just copy data. That said, we don't have a very good model for how to materialize data from external sites and make it available for our client sites, so our option is more or less just to copy. It is better to maintain data at one location.
[1] https://no.wikipedia.org/wiki/Anders_Beer_Wilse
[2] https://lokalhistoriewiki.no/index.php/Anders_Beer_Wilse
[3] http://www.nb.no/nmff/fotograf.php?fotograf_id=3050
On Wed, Dec 9, 2015 at 9:51 PM, André Costa andre.costa@wikimedia.se wrote:
Happy to be of use. There is also one for:
- Swedish photo studios [1]
- Norwegian photographers[2]
- Norwegian photo studios [3]
I'm less familiar with these though and don't have a timeline for wikidata integration.
Cheers, André
[1] http://kulturnav.org/deb494a0-5457-4e5f-ae9b-e1826e0de681
[2] http://kulturnav.org/508197af-6e36-4e4f-927c-79f8f63654b2
[3] http://kulturnav.org/7d2a01d1-724c-4ad2-a18c-e799880a0241
André Costa GLAM developer Wikimedia Sverige
On 9 Dec 2015 15:07, "David Lowe" davidlowe@nypl.org wrote:
Thanks, André! I don't know that I've found that before. Great to get country (or region) specific lists like this. D
On Wednesday, December 9, 2015, André Costa andre.costa@wikimedia.se wrote:
In case you haven't come across it before http://kulturnav.org/1f368832-7649-4386-97b6-ae40cce8752b is the entry point to the Swedish database of (primarily early) photographers curated by the Nordic Museum in Stockholm.
It's not that well integrated into Wikidata yet but the plan is to fix that during early 2016. That would also allow a variety of photographs on Wikimedia Commons to be linked to these entries.
Cheers, André
André Costa | GLAM developer, Wikimedia Sverige | Andre.Costa@wikimedia.se | +46 (0)733-964574
Since no one mentioned it, there is a tool to do the matching to WD much more efficiently: https://tools.wmflabs.org/mix-n-match/
Magnus Manske, 13/12/2015 11:24:
Since no one mentioned it, there is a tool to do the matching to WD much more efficiently: https://tools.wmflabs.org/mix-n-match/
+1
I'm planning to bring a few of the datasets into mix'n'match (@Magnus this is the one I asked about on Twitter) in January, but not all of them are suitable, and I believe separating KulturNav into multiple datasets on mix'n'match makes more sense and makes it more likely that they get matched.
Some of the early adopters of KulturNav have been working with WMSE to facilitate bi-directional matching. This is done on a dataset-by-dataset level since different institutions are responsible for different datasets. My hope is that mix'n'match will help in this area as well, even as a tool for the institutions' own staff, who are often interested in matching entries to Wikipedia (which most of the time means Wikidata).
@John: There are processes for matching KulturNav identifiers to Wikidata entities. Only afterwards are details imported, mainly to source statements [1] and [2]. There are some (not so user-friendly) stats at [3].
Cheers, André
[1] https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_2
[2] https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_3
[3] https://tools.wmflabs.org/lp-tools/misc/data/
André Costa GLAM developer Wikimedia Sverige
There are some pretty good methods for optimizing the match process, but I have not seen any implementation for that against Wikidata items. Only things I've seen are some opportunistic methods. Duck tests gone wrong, or "Darn it was a platypus!"
Hoi, Sorry, I understand sarcasm but I do not understand what it is based upon. Thanks, GerardM
Hi all,
This is an old thread now, but I thought I'd update you that NYPL has now launched the Photographers' Identities Catalog. Read about it here: http://www.nypl.org/blog/2016/03/25/introducing-pic, or skip straight to the site at pic.nypl.org. I hope it may be of interest to some of you. Thanks, David
On 3/25/16 5:00 PM, David Lowe wrote:
Hi all,
This is an old thread now, but I thought I'd update you that NYPL has now launched Photographers' Identities Catalog. Read about it here http://www.nypl.org/blog/2016/03/25/introducing-pic, or skip straight to the site at pic.nypl.org http://pic.nypl.org . I hope it may be of interest to some of you. Thanks, David
David,
Very nice!
I had to search a bit, but eventually found: http://on.nypl.org/25DhGDm .
Kingsley
Magnus, Yes, I'm curious to run the data through mix-n-match to see what it catches that I didn't (and vice versa). I've finished an initial matching and have all but about 700 names from WD to either enter into PIC, or reject (sorry Spiderman https://www.wikidata.org/wiki/Q79037, PIC is only for actual photographers). I just updated PIC with more links into Wikipedias http://mgiraldo.github.io/pic/?address.AddressTypeID=*&address.CountryID=*&Nationality=*&gender.TermID=*&process.TermID=*&role.TermID=*&format.TermID=*&biography.TermID=2028094&collection.TermID=*&Location=*&DisplayName=*&Date=* (almost 12,000 links now). Once PIC is actually launched and at a permanent url (January sometime), I'd love to get PIC ID #s into the WD records. d
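As an aside on the Spiderman case: one way to keep fictional characters out of the initial pull would be to require "instance of: human" in the query itself. The sketch below assumes the public Wikidata SPARQL endpoint is used, which the thread does not confirm. The function names and the User-Agent string are made up for illustration. The identifiers are Wikidata's own: P106 (occupation), Q33231 (photographer), P31 (instance of), Q5 (human), P27 (country of citizenship), P569 and P570 (dates of birth and death).

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit: int = 100) -> str:
    """SPARQL for photographers who are humans, with the fields named in the thread."""
    return f"""
SELECT ?person ?personLabel ?personDescription ?citizenshipLabel ?dob ?dod WHERE {{
  ?person wdt:P106 wd:Q33231 ;   # occupation: photographer
          wdt:P31  wd:Q5 .       # instance of: human
  OPTIONAL {{ ?person wdt:P27  ?citizenship . }}
  OPTIONAL {{ ?person wdt:P569 ?dob . }}
  OPTIONAL {{ ?person wdt:P570 ?dod . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
"""


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the result rows (needs network)."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "pic-matching-sketch/0.1 (illustrative example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    print(build_query(limit=5))
```

Items typed only as fictional characters should drop out under the `wd:Q5` restriction, though real people whose items lack an "instance of" statement would be missed too, so a manual pass over the rejects still seems worthwhile.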
David Lowe, 15/12/2015 00:00:
Once PIC is actually launched and at a permanent url (January sometime), I'd love to get PIC ID #s into the WD records.
Have you proposed the property yet? If not please do: https://www.wikidata.org/wiki/Wikidata:Property_proposal/Authority_control
Nemo
Nemo, Thanks, I will. There's currently no good way to link to each of my entries*, so for now it would be the IDs without links.
*You can use the PIC ID in place of the Name in the URL to link directly to the photographer, like this: http://pic.nypl.org/map/?address.AddressTypeID=*&address.CountryID=*&... *DisplayName=362978*&Date=*&mode=2
We plan to improve this in the future though.
Thanks, David
Also, I should have pointed out that I have 11,000 Wikidata http://pic.nypl.org/map/?address.AddressTypeID=*&address.CountryID=*&Nationality=*&gender.TermID=*&process.TermID=*&role.TermID=*&format.TermID=*&biography.TermID=2028247&collection.TermID=*&bbox=*&DisplayName=*&Date=*&mode=2 entities matched in PIC already. Is there a quick & easy way to get those in?
David Lowe, 03/04/2016 17:10:
Also, I should have pointed out that I have 11,000 Wikidata http://pic.nypl.org/map/?address.AddressTypeID=*&address.CountryID=*&Nationality=*&gender.TermID=*&process.TermID=*&role.TermID=*&format.TermID=*&biography.TermID=2028247&collection.TermID=*&bbox=*&DisplayName=*&Date=*&mode=2 entities matched in PIC already. Is there a quick & easy way to get those in?
IIRC you said you already added those matches to mix-n-match. When the property exists, Magnus is usually very fast at importing the associations into Wikidata with his bot. :)
Nemo
Actually, Nemo, I haven't yet added them to mix-n-match (though we've discussed it previously). I turned all my focus back to preparing the site for launch, thinking it better to engage WD once the site was actually live. I did just submit a proposal for an external ID property. I'll look at mix-n-match again tomorrow. Thanks, all! d
Magnus's QuickStatements tool http://tools.wmflabs.org/wikidata-todo/quick_statements.php can write about 5000 statements an hour at full tilt, so could add those 11,000 identifier values pretty quickly.
It looks like there's no problem slotting the PIC id into the relevant address to make a URL that works -- it would be no problem to specify this as the URL format for the property. And if/when a shorter form of URL were to become available it would be no problem to change to this, just by changing the URL formatter for the property.
-- James.
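To make the two points above concrete, here is a rough sketch of how matched pairs could be turned into QuickStatements input and how a formatter URL would expand. It assumes the tool's tab-separated "item, property, value" input with string values in double quotes. The property number `P0000` is a placeholder because the property had not been created at this point in the thread, the formatter URL is a made-up example.org address, and the PIC ID is invented. Q4115189 is the Wikidata sandbox item, used here so that nothing real is implied.

```python
from typing import Iterable, List, Tuple

PIC_PROPERTY = "P0000"  # placeholder: the real property did not exist yet
FORMATTER_URL = "https://example.org/pic/$1"  # hypothetical, not the real PIC URL


def quickstatements_lines(matches: Iterable[Tuple[str, str]], prop: str) -> List[str]:
    """One tab-separated line per (Wikidata QID, PIC ID) pair; the ID is a quoted string."""
    return [f'{qid}\t{prop}\t"{pic_id}"' for qid, pic_id in matches]


def expand_formatter(formatter_url: str, pic_id: str) -> str:
    """Substitute the identifier for $1, as a property's formatter URL does."""
    return formatter_url.replace("$1", str(pic_id))


# Illustrative pair only: the sandbox item and an invented PIC ID.
matches = [("Q4115189", "000000")]
for line in quickstatements_lines(matches, PIC_PROPERTY):
    print(line)
print(expand_formatter(FORMATTER_URL, "000000"))
```

Because the link is derived from the stored ID plus the formatter, a later move to shorter or more stable URLs would only mean changing the formatter string, not re-importing the identifiers.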
James, That's great news, as it will satisfy some concerns on the Library's end about switching from our current URLs to stable URIs in the future. And Magnus (I know you're out there!), I'll reach out tomorrow to see how we may most conveniently progress. Thanks all, d
On Sun, Apr 3, 2016 at 5:03 PM, James Heald j.heald@ucl.ac.uk wrote:
Magnus's QuickStatements tool http://tools.wmflabs.org/wikidata-todo/quick_statements.php can write about 5000 statements an hour at full tilt, so could add those 11,000 identifier values pretty quickly.
It looks like there's no problem slotting the PIC id into the relevant address to make a URL that works -- it would be no problem to specify this as the URL format for the property. And if/when a shorter form of URL were to become available it would be no problem to change to this, just by changing the URL formatter for the property.
-- James.
David Lowe, 03/04/2016 18:42:
Actually, Nemo, I haven't yet added them to mix-n-match (though we've discussed previously).
Ah, sorry. Then follow James' suggestion. I recommended QuickStatements to the national central library of Italy (BNCF) too; AFAIK it worked well for them. One advantage is that all edits are attributed to you in the history.
Nemo