Hi all,
I understand this is probably outside the scope of Wikidata-l, but I am looking for advice on what I am noticing is a somewhat recurring issue with VIAF identifiers.
For a lot of items, I am finding 2 or more VIAF numbers for BLP subjects.
Recently I've found up to 4 VIAF numbers:
- https://www.wikidata.org/wiki/Q45311838
- https://www.wikidata.org/wiki/Q45311745
Is there coordination between Wikidata and VIAF that is automated and fixes things like this?
I don't want to remove some of these VIAF numbers as they are tied to "legit" authority bodies. I understand there is often a "main" VIAF number...
Thanks in advance for any assistance and/or advice with this.
- Erika
*Erika Herzog* Wikipedia *User:BrillLyle https://en.wikipedia.org/wiki/User:BrillLyle*
I have no answer to your question about coordination between VIAF and Wikidata,
but I asked this question at WikidataCon in Berlin
and got the answer: if it's duplicated, add them all, and then we take them away when it's merged in VIAF.
Who "we" are, I don't know ;-)
Regards Magnus Sälgö Stockholm, Sweden User:Salgo60 https://www.wikidata.org/wiki/User:Salgo60 https://twitter.com/salgo60
On 12 Dec 2017, at 11:49, Brill Lyle <wp.brilllyle@gmail.com> wrote:
Is there coordination between Wikidata and VIAF that is automated and fixes things like this?
Yes, I think you should add all VIAFs. That is what I have done. I suppose the VIAF people might discover mergeable VIAFs and merge them. On Wikidata, Ivan A. Krestinin's bot at https://www.wikidata.org/wiki/User:KrBot seems to operate and clean things up when VIAF merges records; see, e.g., the edits after the revision in this item:
https://www.wikidata.org/w/index.php?title=Q20984804&oldid=473484364#P21...
I don't know whether VIAF monitors Wikidata.
Finn Årup Nielsen http://people.compute.dtu.dk/faan/
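The cleanup described above could be sketched roughly as follows. This is not KrBot's actual code: the `redirects` mapping (merged VIAF ID to the ID it now points to) and the function name `collapse_merged_viaf_ids` are hypothetical, and the IDs are made up for illustration.

```python
def collapse_merged_viaf_ids(viaf_ids, redirects):
    """Return the VIAF IDs that survive once merged IDs are followed.

    viaf_ids  -- list of VIAF ID strings currently on one item
    redirects -- hypothetical dict mapping a merged (old) ID to the ID
                 it now points to on the VIAF side
    """
    survivors = []
    for viaf_id in viaf_ids:
        seen = set()
        # Follow the redirect chain; `seen` guards against accidental cycles.
        while viaf_id in redirects and viaf_id not in seen:
            seen.add(viaf_id)
            viaf_id = redirects[viaf_id]
        # Keep each surviving ID only once, in first-seen order.
        if viaf_id not in survivors:
            survivors.append(viaf_id)
    return survivors


# Made-up IDs: "222" and "333" were merged into "111" on the VIAF side.
print(collapse_merged_viaf_ids(["111", "222", "333", "444"],
                               {"222": "111", "333": "111"}))
# ['111', '444']
```

Until a merge happens on the VIAF side the mapping is empty, so every identifier on the item is kept, which matches the advice to add them all.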
On 12/12/2017 11:53 AM, MagnusSalgo wrote:
And got the answer: if it's duplicated, add them all, and then we take them away when it's merged in VIAF.
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Apologies for the delay in writing back. I want to thank everyone for the responses and additional information.
Ad nauseam, I know, but I'm obsessed with Authority control and VIAF is the major source for that (for me) so this discussion and help makes me really happy. Appreciate it.
*COMMENTS*
*A. VIAF deletions*
Thanks for the advice. I'm glad to have it --> I will NOT delete multiple VIAFs -- and will actually be even a bit more exhaustive when inputting VIAF metadata into Wikidata. Thanks for this feedback.
*B. VIAF bot*
Good to know there's a bot that does maintenance. Thanks Finn. #yay
*C. Wikidata into VIAF*
Seeing as I am creating a ton of VIAF identifiers for new Wikidata items, how do these new Wikidata values that aren't in VIAF get ingested? I am hoping there is a bot that does this. I would not enjoy sending a list to OCLC/VIAF, and suspect it is automated.
I am curious how this works and what the expected / typical timeframe of this process of ingesting new Wikidata items into existing VIAF (which don't have Wikidata yet) might be....
*D. Wikipedia:VIAF*
https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors
Yikes. I like the Wikidata page much better, even if I'm not 100% understanding it. This is nightmarish....
For some reason I thought there was an email to directly send VIAF/OCLC errors.
*TOPICS*
*1. OCLC Wikipedian-in-Residence*
Yes, it was Max Klein (https://en.wikipedia.org/wiki/User:Notconfusing), the prior WiR, who I think was the contact person. https://www.oclc.org/research/news/2012/05-22.html
Yes, he moved on. I, too, am curious who is functioning in the VIAF-contact role now. There has to be someone doing it.
There is a new WiR Monika Sengul-Jones, but she is working on an OCLC/WebJunction video tutorial <shudder> project that is reaching out to public librarians in the U.S., co-funded by WMF and the Knight Foundation. I tried to reach out to her to do some Authority control+Wikidata WikiFacilitating but never got traction. I am probably too shrill when it comes to Wikidata -- and Authority control. Lost opportunity. Oh well.
*2. Single value violations*
https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violation... "Single_value"_violations
I don't understand this at all. Apologies, I feel very dumb but would appreciate maybe a general, non-tech speak explanation if someone might be so kind.
Thanks again!
- Erika
*Erika Herzog* Wikipedia *User:BrillLyle https://en.wikipedia.org/wiki/User:BrillLyle*
*2. Single value violations*
https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violation... "Single_value"_violations
I don't understand this at all. Apologies, I feel very dumb but would appreciate maybe a general, non-tech speak explanation if someone might be so kind.
It's explained here: https://www.wikidata.org/wiki/Q19474404
It means that the "VIAF Identifier" P214 property is expected to have only 1 value. But you can really treat this as a warning, because you will be loading multiple VIAF identifiers for the same person and getting VIAF to merge or clean up later, which bots will then hopefully pick up, deleting the extra identifiers.
-Thad +ThadGuidry https://plus.google.com/+ThadGuidry
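In plain terms, the report lists every item that carries more than one P214 value. A minimal sketch of that check, assuming a hypothetical in-memory mapping from item to its VIAF values (the real report is generated from the Wikidata database, not by code like this; the keys and IDs below are made up):

```python
def single_value_violations(viaf_by_item):
    """Return {item: values} for items holding more than one distinct VIAF ID."""
    violations = {}
    for item, values in viaf_by_item.items():
        distinct = sorted(set(values))  # repeated copies of one ID do not count
        if len(distinct) > 1:
            violations[item] = distinct
    return violations


# Made-up items and VIAF values, for illustration only.
sample = {"A": ["100"], "B": ["200", "201"], "C": ["300", "300"]}
print(single_value_violations(sample))
# {'B': ['200', '201']}
```

An entry in the result is a prompt to look at the item, not an error in itself: both values may be legitimate until VIAF merges its clusters.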
VIAF identifies a variety of things, including people, pseudonyms, organizations, places, etc. There is no intentional distinction between those things and their “identity”.
Jeff
From: Wikidata wikidata-bounces@lists.wikimedia.org on behalf of "leadsong@webname.com" leadsong@webname.com
Reply-To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Date: Thursday, December 14, 2017 at 11:14 AM
To: "wikidata@lists.wikimedia.org" wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Identifiers: Multiple VIAF numbers
That's because P214 should be a property of an identity, not of a person. We conflate the two concepts at our peril. One person can have multiple identities, aliases, pseudonyms, characters, roles, jobs, etc. Sometimes the same identity is adopted by multiple people, either sequentially or simultaneously: "I am Spartacus!", "John Bull", "G.I. Joe", "POTUS", "The King is dead, long live the King!", "Editor in Chief of the New York Times", ...
LeadSongDog
Sent: Thursday, December 14, 2017 at 8:13 AM
From: "Thad Guidry" thadguidry@gmail.com
To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Identifiers: Multiple VIAF numbers
It means that the "VIAF Identifier" P214 property can only have 1 value.
I agree that “named entity” makes more sense. I’ll pass that edit suggestion along to the folks who maintain the page.
Here is a blog post from 2011 where we talk about remodeling VIAF into a coherent hub-and-spoke model revolving around the entity itself.
http://outgoing.typepad.com/outgoing/2011/04/changes-to-viafs-rdf.html
Jeff
From: Wikidata wikidata-bounces@lists.wikimedia.org on behalf of "leadsong@webname.com" leadsong@webname.com
Reply-To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Date: Thursday, December 14, 2017 at 12:19 PM
To: "wikidata@lists.wikimedia.org" wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Identifiers: Multiple VIAF numbers
Well, http://www.oclc.org/en/viaf.html says that VIAF
* Links national and regional-level authority records, creating a cluster record for each unique *name* (my bolding).
Some clarification is needed if the "name" is supposed to be "named entity" instead.
LeadSongDog
Sent: Thursday, December 14, 2017 at 11:20 AM
From: "Young,Jeff (OR)" jyoung@oclc.org
To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Subject: Re: [Wikidata] Identifiers: Multiple VIAF numbers
VIAF identifies a variety of things, including people, pseudonyms, organizations, places, etc. There is no intentional distinction between those things and their “identity”.
Jeff
Thanks Thad. That was very helpful.
So Jeff, if you aren't the contact from OCLC who works with VIAF, can you tell us who is, and if they have any plans to address this issue? :-)
Best,
- Erika
*Erika Herzog* Wikipedia *User:BrillLyle https://en.wikipedia.org/wiki/User:BrillLyle*
On Thu, Dec 14, 2017 at 12:32 PM, Young,Jeff (OR) jyoung@oclc.org wrote:
I agree that “named entity” makes more sense. I’ll pass that edit suggestion along to the folks who maintain the page.
Hi Erika,
The team page needs to be updated, but I'll pass the thread along to them:
http://www.oclc.org/research/activities/viaf.html
Jeff
From: Wikidata wikidata-bounces@lists.wikimedia.org on behalf of Brill Lyle wp.brilllyle@gmail.com
Sent: Sunday, December 17, 2017 3:56:05 AM
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata] Identifiers: Multiple VIAF numbers
So Jeff, if you aren't the contact from OCLC who works with VIAF, can you tell us who is, and if they have any plans to address this issue? :-)
Hi Jeff,
Thanks for the linkage. Very interesting.
Would you mind pinging me off list when/if the page is updated?
I will update the VIAF page on En Wikipedia, I think....
Thanks again!
- Erika
*Erika Herzog* Wikipedia *User:BrillLyle https://en.wikipedia.org/wiki/User:BrillLyle*
On Sun, Dec 17, 2017 at 10:09 AM, Young,Jeff (OR) jyoung@oclc.org wrote:
The team page needs to be updated, but I'll pass the thread along to them:
http://www.oclc.org/research/activities/viaf.html
Finn Aarup Nielsen, 12/12/2017 21:29:
Yes, I think you should add all VIAFs. That is what I have done.
When I edit an item manually, I usually also mark one of the identifiers as preferred. In Wikimedia Italia events I regularly mention this as one of the benefits of Wikidata: it's the only real "broker" of identifiers, where links and mistakes or duplicates can emerge.
Federico
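Wikidata statements carry a rank (preferred, normal, or deprecated), and consumers that ask only for the "best" values get the preferred ones when any exist. A small sketch of that selection rule, using a simplified, hypothetical list of (value, rank) pairs rather than the real Wikibase JSON; the VIAF IDs are made up:

```python
def best_values(statements):
    """Pick the values a rank-aware consumer would use.

    statements -- list of (value, rank) tuples, with rank one of
                  "preferred", "normal", or "deprecated"
    """
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    # No preferred statement: fall back to normal rank.
    # Deprecated statements are never returned.
    return [value for value, rank in statements if rank == "normal"]


# Made-up VIAF IDs.
print(best_values([("111", "normal"), ("222", "preferred"), ("333", "deprecated")]))
# ['222']
print(best_values([("111", "normal"), ("333", "deprecated")]))
# ['111']
```

Marking one identifier as preferred therefore gives downstream users a single "main" value without deleting the others.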
Dear Erika,
AFAIK the problem should be solved VIAF-side. The problem is that most of the national library systems that give data to VIAF are *not* coordinated among themselves. In fact, VIAF was born to solve this problem; then came Wikidata, which helps VIAF find more and more duplicates in its database.
Until some time ago, there was a Wikimedian in residence taking care of all Wikidata (and all other Wikimedia)-related things, but I don't know if it's still the case or not. I think I remember that he left, and was replaced by somebody else, but then I don't know if we still have somebody there and/or there is this coordination, even if there should be one (as in "there should be already some sort of coordination", but also as in "the two databases MUST talk to each other").
What I know is that whenever a duplicate is resolved VIAF-side, a bot runs on Wikidata and removes the deleted record. How often this happens, I don't know.
My suggestion is to NOT remove those duplicates, since they are all legit (for the time being). This does not solve the 8242 "single value violations" problem[1] we have at the moment, but we might want to talk to OCLC about this. If you need help, or if I can be of help, please let me know.
[1] https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violation...
As I understand it, VIAF identities are not intended as one per person. Names change, and people use pseudonyms. Each variation used in an included authority record should make its way to a VIAF ID. One will be the preferred form and the others should eventually link to that one as over time OCLC iterates the refinements to the database.
Corrections in specific cases can be handled via https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors if nothing better is available.
B.T.W. is there still a Wikimedian-in-residence at OCLC? It used to be Maximilian Klein, I believe.
LeadSongDog
LeadSongDog is correct. Multiple VIAF identities can be attached. BTW, we already have a "named as" / "credited as" property for this: https://www.wikidata.org/wiki/Property:P1810
-Thad +ThadGuidry https://plus.google.com/+ThadGuidry
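One way to record which name each VIAF cluster uses is to attach P1810 as a qualifier on the P214 statement. The sketch below builds one tab-separated line, assuming the QuickStatements version 1 batch syntax (item, property, quoted string value, then qualifier property and quoted value). The function name is hypothetical, the item ID, VIAF number, and name are placeholders, and the syntax should be checked against the QuickStatements help page before any real use.

```python
def viaf_statement_line(item, viaf_id, named_as):
    """Build one tab-separated line: item, P214, "VIAF ID", P1810, "name".

    Assumes string values are wrapped in double quotes and that a
    qualifier is appended as an extra property/value pair.
    """
    fields = [item, "P214", f'"{viaf_id}"', "P1810", f'"{named_as}"']
    return "\t".join(fields)


# Placeholder item ID, VIAF number, and name.
print(viaf_statement_line("Q0", "000000000", "Example Name"))
```

With one such line per VIAF number, each identifier on the item stays tied to the pseudonym or name form it was issued for.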