Hi everyone,
As a data quality addict, I've been investigating the coverage of external identifiers linked to Wikidata items about people.
Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems that even the second most used ID (VIAF) covers only about *25%* of people items. Beyond that, there is a long tail of IDs that are barely used at all.
So here is my question: *which external identifiers deserve an effort to achieve exhaustive coverage?*
Looking forward to your valuable feedback. Cheers,
Marco
[1] https://tools.wmflabs.org/sqid/#/browse?type=properties "Select datatype" set to "ExternalId", "Used for class" set to "human Q5"
[2] total people: http://tinyurl.com/ybvcm5uw
[3] people with a VIAF link: http://tinyurl.com/ya6dnpr7
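The coverage figure above can be reproduced along the following lines. This is only a minimal sketch, not the queries behind the shortened links: it assumes P31 (instance of), Q5 (human) and P214 (VIAF ID), and the function names are made up for illustration. The counts passed to coverage_percent would come from running the two generated queries on the Wikidata Query Service.

```python
from typing import Optional


def build_count_query(id_property: Optional[str] = None) -> str:
    """Build a SPARQL query counting humans, optionally only those
    that carry the given external-identifier property (e.g. "P214")."""
    lines = [
        "SELECT (COUNT(DISTINCT ?person) AS ?count) WHERE {",
        "  ?person wdt:P31 wd:Q5 .",
    ]
    if id_property is not None:
        # Restrict to people with at least one value for the identifier.
        lines.append(f"  ?person wdt:{id_property} ?id .")
    lines.append("}")
    return "\n".join(lines)


def coverage_percent(with_id: int, total: int) -> float:
    """Share of items carrying the identifier, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_id / total


if __name__ == "__main__":
    print(build_count_query())        # all people
    print(build_count_query("P214"))  # people with a VIAF link
```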
I guess the question for me is: how do we do this in practice? How do we make sure Wikidata stays up to date and in sync with the external databases we think are important?
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
On 7 September 2017 at 20:16, john cummings mrjohncummings@gmail.com wrote:
I guess this question for me is how do we do this in practice? How do we make sure Wikidata stays up to date/synced with external databases we think are important?
At least part of that solution has to be to get the maintainers of those external databases involved - we have an API, and they can use it to update items, programmatically, if they store Wikidata IDs in their data (or if we have a third identifier in common).
OCLC (who manage VIAF) are doing this, as are Quora (not just for people), and others are beginning to show interest.
Digital signing of statements, when it is enabled, will encourage them, by demonstrating that their contributions will be respected and valued.
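To make the "they can use the API" point concrete, here is a rough sketch of what a single write could look like from the external maintainer's side. It only builds the parameters for the wbcreateclaim action of the MediaWiki/Wikibase API and sends nothing; the function name, the example identifier value and the token are placeholders, and login, CSRF token retrieval, references and error handling are left out.

```python
import json

# Endpoint the parameters would be POSTed to (not contacted in this sketch).
API_URL = "https://www.wikidata.org/w/api.php"


def build_add_identifier_request(qid: str, property_id: str,
                                 external_id: str, csrf_token: str) -> dict:
    """Build POST parameters that add one external-identifier statement."""
    return {
        "action": "wbcreateclaim",
        "entity": qid,
        "property": property_id,
        "snaktype": "value",
        # String values are passed JSON-encoded, i.e. with surrounding quotes.
        "value": json.dumps(external_id),
        "token": csrf_token,
        "format": "json",
    }


if __name__ == "__main__":
    # Q4115189 is the Wikidata sandbox item; "12345" is a made-up value.
    params = build_add_identifier_request("Q4115189", "P214", "12345", "TOKEN")
    print(API_URL, params)
```

A maintainer who already stores Wikidata IDs next to their own records could loop over those records and issue one such request per missing statement.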
Hi Marco,
I guess this depends on what you mean by "exhaustive". Exhaustive in that every Wikidata item has ID X, or exhaustive in that we have every instance of ID X in Wikidata?
The first is probably not going to happen, as the vast majority of external identifiers have a defined scope for what they identify. Some are pretty broad - VIAF is essentially "everyone who exists in a library catalogue as an author or subject" - but still have a limit. We're never really going to reach a situation where there is a single identifier type that covers everyone, unless we're linking across to another Wikidata-type comprehensive knowledgebase, and even then we'd need to ensure we're in a position where they already cover everything in Wikidata.
The second can be (and has been) done - the largest one I know of offhand for people is the Oxford DNB (60k items), but for non-people we have complete coverage of, e.g., Swedish district codes, P1841 (160k items). It's a bit of a slog to get these completed and then maintained, since the last 5-10% tend to be the more challenging, complicated cases, but one or two determined people can make it happen. And of course it's not appropriate for many identifiers, as they may issue IDs for things that we don't intend to have in Wikidata, so we will never completely cover them.
I should quickly plug the "expected completeness" property which is really useful for identifiers - P2429 - as this can quickly show whether something is a) completely on Wikidata; b) not complete yet but eventually might be; or c) probably never will be. Not very widely rolled out yet, though...
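One way to see how far P2429 has been rolled out is to list the external-identifier properties that carry it. The sketch below only builds the SPARQL text; the function name is made up, and the query has not been tuned or checked against the live service.

```python
def build_completeness_query() -> str:
    """SPARQL listing external-ID properties with an 'expected completeness' (P2429) value."""
    return "\n".join([
        "SELECT ?property ?propertyLabel ?completenessLabel WHERE {",
        "  ?property wikibase:propertyType wikibase:ExternalId ;",
        "            wdt:P2429 ?completeness .",
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }',
        "}",
    ])


if __name__ == "__main__":
    print(build_completeness_query())
```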
Andrew.
As a basic rule for "which external identifiers are worth covering", I would begin with any national identifiers we have for people (politicians, artists, writers, theologians, scientists, etc.), then national identifiers for organizations (government-related, GNP-related businesses, nonprofits, educational institutions, etc.), then national identifiers for places (census-defined population centers, battle scenes, etc.).
In my opinion, the question should not be "which identifier has the most coverage" but "which items have the most identifiers".
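Turning the question around like this amounts to counting identifiers per item rather than items per identifier. A small sketch with toy data (the item keys and identifier values are placeholders, and the function name is invented for the example):

```python
from typing import Dict, List, Tuple


def rank_by_identifier_count(
    items: Dict[str, Dict[str, str]]
) -> List[Tuple[str, int]]:
    """Rank items by how many external identifiers they carry, most first.

    `items` maps an item key to a {property: value} dict of its identifiers.
    Ties are broken alphabetically by item key so the order is stable.
    """
    counts = [(key, len(ids)) for key, ids in items.items()]
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    toy = {
        "item-a": {"P214": "x1", "P227": "x2"},  # VIAF and GND
        "item-b": {},
        "item-c": {"P214": "x3"},
    }
    for key, n in rank_by_identifier_count(toy):
        print(key, n)
```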
Is anyone working on an "auto-resolve" bot? If you have VIAF (but nothing else), you can resolve other identifiers via the VIAF site; similarly, if you have only GND, you could try to reverse-lookup VIAF.
I think a list of items that have zero external identifiers, ordered by "importance" (incoming Wikidata links, number of statements, etc.), would also be helpful.
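Both ideas can be sketched without touching any live service. In the sketch below, the "cluster" dict stands in for whatever a VIAF lookup would return (no real VIAF API call is made), the field names incoming_links and statements are assumptions for the "importance" signals, and all values are toy data.

```python
from typing import Dict, List


def missing_identifiers(item_ids: Dict[str, str],
                        cluster: Dict[str, str]) -> Dict[str, str]:
    """Identifiers found in the external cluster but absent from the item."""
    return {prop: val for prop, val in cluster.items() if prop not in item_ids}


def zero_identifier_items(items: List[dict]) -> List[dict]:
    """Items without any external identifier, most 'important' first."""
    empty = [it for it in items if not it["identifiers"]]
    return sorted(
        empty,
        key=lambda it: (it["incoming_links"], it["statements"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Item has only a VIAF ID; the (hypothetical) cluster also knows a GND ID.
    print(missing_identifiers({"P214": "v1"}, {"P214": "v1", "P227": "g1"}))

    toy = [
        {"qid": "item-a", "identifiers": {}, "incoming_links": 3, "statements": 10},
        {"qid": "item-b", "identifiers": {"P214": "v2"}, "incoming_links": 99, "statements": 50},
        {"qid": "item-c", "identifiers": {}, "incoming_links": 40, "statements": 5},
    ]
    print([it["qid"] for it in zero_identifier_items(toy)])
```

A real bot would still need a review step, since an identifier resolved through a third source can be wrong.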
On Fri, Sep 8, 2017 at 10:52 AM Jane Darnell jane023@gmail.com wrote:
As a basic rule for "which external identifiers are worth covering", I would begin with any national identifiers we have for people (politicians, artists, writers, theologians, scientists, etc), then national identifiers for organizations (government-related, GNP-related businesses, nonprofits, educational institutions, etc), then national identifiers for places (census-defined population centers, battle-scenes, etc)
In my opinion, the question should not be "which identifier has the most coverage" but "which items have the most identifiers"
In general, I think it would be great to store inside Wikidata the graph of relations between identifiers. Something like:
VIAF linksTo ISNI
VIAF linksTo GND
…
GRID linksTo ISNI
arXiv linksTo DOI
Last time I looked, there was no simple way to do that. So for WikiProject Universities we have used a manual approach: https://www.wikidata.org/wiki/Wikidata:WikiProject_Universities/External_dat...
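Outside Wikidata, such a graph is easy to hold as an adjacency list. The sketch below uses only the four example edges given above and a made-up helper that answers "which identifier schemes can be reached from this one?"; it says nothing about how the relations would be modelled as Wikidata statements.

```python
from collections import deque
from typing import Dict, List, Set

# The example "linksTo" edges from this message, nothing more.
LINKS: Dict[str, List[str]] = {
    "VIAF": ["ISNI", "GND"],
    "GRID": ["ISNI"],
    "arXiv": ["DOI"],
}


def reachable(start: str, links: Dict[str, List[str]] = LINKS) -> Set[str]:
    """Breadth-first search for all schemes reachable from `start`."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    print(sorted(reachable("VIAF")))
```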
Antonin
On 08/09/2017 11:01, Magnus Manske wrote:
Is anyone working on an "auto-resolve" bot? If you have VIAF (but nothing else), you can resolve other identifiers via the VIAF site; similarly, if you have only GND, you could try to reverse-lookup VIAF.
I think a list of items that have zero external identifiers, ordered by "importance" (incoming wikidata links, number of statements etc) would also be helpful.
Somewhat related to this discussion is the coli-conc project, which collects statistics about KOS-type (thesaurus, authority file etc.) identifier links in Wikidata:
http://coli-conc.gbv.de/concordances/wikidata/
You can also find statistics about indirect mappings, from one KOS via Wikidata to another KOS.
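An indirect mapping of this kind is essentially a join on the Wikidata item. The following is a minimal sketch with toy identifiers, not the coli-conc implementation; the function name and the input shape (one dict per KOS, mapping its identifier to a Wikidata item ID) are assumptions.

```python
from typing import Dict, List


def indirect_mapping(kos_a_to_qid: Dict[str, str],
                     kos_b_to_qid: Dict[str, str]) -> Dict[str, List[str]]:
    """Map identifiers of KOS A to identifiers of KOS B via shared Wikidata items."""
    # Invert the second mapping: Wikidata item -> all KOS B identifiers on it.
    qid_to_b: Dict[str, List[str]] = {}
    for b_id, qid in kos_b_to_qid.items():
        qid_to_b.setdefault(qid, []).append(b_id)
    # Keep only KOS A identifiers whose item also carries a KOS B identifier.
    return {
        a_id: sorted(qid_to_b[qid])
        for a_id, qid in kos_a_to_qid.items()
        if qid in qid_to_b
    }


if __name__ == "__main__":
    a = {"a-1": "item-x", "a-2": "item-y"}
    b = {"b-7": "item-x", "b-8": "item-z"}
    print(indirect_mapping(a, b))
```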
-Osma
Hi Andrew, all,
In my eyes, a large incentive for the maintainers of external databases - and I am one, for the ZBW German National Library for Economics - is the data they can gain: not only in terms of property values and attached Wikipedia pages, but also in terms of identifiers and links to other vocabularies.
This can go as far as Wikidata replacing a custom database for identifier mappings. We approached that by moving a mapping of GND and RePEc (P2428) identifiers to Wikidata. Still, that was only 3,100 of the 460,000 GND IDs and 50,000 RePEc IDs in our EconBiz portal alone, so it's still very sparse - but an improvement (for details, see https://hackmd.io/p/S1YmXWC0e). Adding, e.g., the rest of the "most important economists" from RePEc as well as GND is very tempting, as it will extend the mapping with relatively little effort.
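To put "very sparse" into numbers, the figures quoted above work out as follows (a plain arithmetic sketch; the constant and function names are made up):

```python
MAPPED_PAIRS = 3_100   # GND/RePEc pairs moved to Wikidata
GND_IDS = 460_000      # GND IDs in the EconBiz portal
REPEC_IDS = 50_000     # RePEc IDs in the EconBiz portal


def share_percent(mapped: int, total: int) -> float:
    """Mapped share of an identifier set, as a percentage rounded to 2 places."""
    return round(100.0 * mapped / total, 2)


if __name__ == "__main__":
    print(share_percent(MAPPED_PAIRS, GND_IDS))    # 0.67
    print(share_percent(MAPPED_PAIRS, REPEC_IDS))  # 6.2
```

So the mapping covers well under 1% of the GND IDs and a little over 6% of the RePEc IDs.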
For vocabularies limited in size, such as the STW Thesaurus for Economics, a complete mapping can be achieved (if relations beyond equivalence are available - see https://www.wikidata.org/wiki/Wikidata:Property_proposal/mapping_relation_ty...). The incentive for that is even higher, because it saves the owner of the vocabulary the entire cost of maintaining possibly multiple mappings to third-party vocabularies.
So I think that embracing and extending Wikidata's role as a *universal linking hub* benefits everybody, and will greatly improve total coverage, because it offers incentives to communities not previously involved in Wikidata.
Cheers, Joachim
PS. Thanks for the hint to P2429 - looks very useful!
-----Original Message----- From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On Behalf Of Andrew Gray Sent: Thursday, 7 September 2017 21:26 To: Discussion list for the Wikidata project. Subject: Re: [Wikidata] Which external identifiers are worth covering?
Hi Marco,
I guess this depends what you mean by "exhaustive". Exhaustive in that every Wikidata item has ID X, or exhaustive in that we have every instance of ID X in Wikidata?
The first is probably not going to happen, as the vast majority of external identifiers have a defined scope for what they identify. Some are pretty broad - VIAF is essentially "everyone who exists in a library catalogue as an author or subject" - but still have a limit. We're never really going to reach a situation where there is a single identifier type that covers everyone, unless we're linking across to another Wikidata-type comprehensive knowledgebase, and even then we'd need to ensure we're in a position where they already cover everything in Wikidata.
The second can (and has) been done - the largest one I know of offhand for people is the Oxford DNB (60k items) but for non-people we have complete coverage of eg Swedish district codes, P1841 (160k items). It's a bit of a slog to get these completed and then maintained, since the last 5-10% tend to be more challenging complicated cases, but one or two determined people can make it happen. And of course it's not appropriate for many identifiers, as they may issue IDs for things that we don't intend to have in Wikidata, so we will never completely cover them.
I should quickly plug the "expected completeness" property which is really useful for identifiers - P2429 - as this can quickly show whether something is a) completely on Wikidata; b) not complete yet but eventually might be; or c) probably never will be. Not very widely rolled out yet, though...
Andrew.
On 7 September 2017 at 19:51, Marco Fossati fossati@spaziodati.eu wrote:
external identifiers linked to Wikidata items about people.
I'll take this as an invitation to remind everyone about ORCID iDs ;-)
See:
https://www.wikidata.org/wiki/Wikidata:ORCID
and:
https://en.wikipedia.org/wiki/Wikipedia:ORCID
Not only is this an identifier for people who often have no other ID for us to use, but it is one that they can register for, and manage, themselves.
So we can help ourselves by encouraging the people we write about to register for, and use, one. This applies especially (but not only) to the authors of the scientific papers which we are currently adding to Wikidata.
However, although an ORCID iD is free to register and use, and ORCID is an open source project run by a non-profit foundation, there has been some resistance, from a few noisy individuals, to us doing this. The German Wikipedia community has, so far, refused to include ORCID iDs in its "Normdaten" ("Authority control") template.
Hi Marco,
On 07-09-17 20:51, Marco Fossati wrote:
Hi everyone,
As a data quality addict, I've been investigating the coverage of external identifiers linked to Wikidata items about people.
Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems that even the second most used ID (VIAF) only covers *25%* of people items circa. Then, there is a long tail of IDs that are barely used at all.
So here is my question: *which external identifiers deserve an effort to achieve exhaustive coverage?*
I've been doing this for painters. See https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings/Crea... and https://www.wikidata.org/wiki/Wikidata:WikiProject_sum_of_all_paintings/Crea... .
Maarten