Hoi,
Given that Wikidata holds identifiers for many external sources, reconciliation is often less a job for crowds and less of a challenge than it is made out to be. A few examples: the OCLC is involved with two distinct identifiers, VIAF and ISNI. Both are actively maintained. When we include VIAF numbers in Wikidata, there will be instances where the identifiers become redirects. The same is true for ISNI. When we have the latest VIAF numbers, the ISNI numbers are highly likely to be correct (better than 95%, the minimum requirement for imports at ISNI).

When we share our identifiers regularly, we learn about redirects and gain the direct links. We shared our identifiers and VIAF identifiers with the Open Library. They now include them, and in return we received a file that helped us deduplicate our Open Library identifiers and replace the redirects. What is infuriating is that there are Open Library identifiers hidden in the Freebase data. They cannot be exported, so we cannot send them to OL for processing and then import them into Wikidata. We do a subpar job as a consequence.
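
To make the redirect replacement and deduplication step concrete, here is a rough sketch in Python. It is not the process that was actually used with the Open Library. The function names, the shape of the redirect file (a simple old-to-new mapping) and the identifiers are all made up for illustration.

```python
from collections import defaultdict


def resolve(identifier, redirects):
    """Follow a redirect chain to its final identifier, guarding against cycles."""
    seen = set()
    while identifier in redirects and identifier not in seen:
        seen.add(identifier)
        identifier = redirects[identifier]
    return identifier


def reconcile(item_ids, redirects):
    """Return the updated item -> identifier map and the identifiers shared by several items."""
    updated = {item: resolve(ext_id, redirects) for item, ext_id in item_ids.items()}
    by_ext_id = defaultdict(list)
    for item, ext_id in updated.items():
        by_ext_id[ext_id].append(item)
    # Items that end up on the same external identifier are duplicate candidates.
    duplicates = {
        ext_id: sorted(items) for ext_id, items in by_ext_id.items() if len(items) > 1
    }
    return updated, duplicates


# Hypothetical data: Wikidata items with an external identifier each,
# plus a redirect list as it might come back from the partner.
item_ids = {"Q1": "OL100A", "Q2": "OL200A", "Q3": "OL300A"}
redirects = {"OL100A": "OL200A", "OL300A": "OL301A", "OL301A": "OL302A"}

updated, duplicates = reconcile(item_ids, redirects)
print(updated)
print(duplicates)
```

In this made-up example Q1 and Q2 end up on the same identifier once the redirect is followed, so they are flagged as candidates for a merge, and Q3 gets the identifier at the end of its redirect chain.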

Another project where we will gain information from multiple sources is the Biodiversity Heritage Library. We may gain links through their collaboration with the Internet Archive and the OCLC. Because of the shared identifiers, this will reduce the chance of introducing duplicates at our end. It will also reduce the number of people we have to process before they are included in Wikidata. It will allow OCLC, BHL and IA to learn of identifiers as we have them, allowing for subsequent improvement in quality for all of us.

So in my opinion we should aggressively share identifiers, collaborate, seek out the redirects and replace them, and become more and more a focal point for links between resources.
Thanks,
     GerardM

On 8 August 2017 at 11:13, Marco Fossati <fossati@spaziodati.eu> wrote:
Hi Antonin,

On 8/7/17 20:36, Antonin Delpeuch (lists) wrote:
Does anybody know an alternative to CrowdFlower that can be used for
free with volunteer workers?
There you go: https://crowdcrafting.org/
Hope this helps you keep up your great work on OpenRefine.

I believe entity reconciliation is one of the most challenging tasks that keep third-party data providers away from imports to Wikidata.
Cheers,

Marco


_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata