Hi all,

The GlobalFactSync project [1] starts tomorrow, and we have assembled a good team to tackle many of these issues.

[1] https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSyncRE

As part of the project we will:

- work on syncing Wikidata and the Wikipedias' infoboxes across languages, and also try to sync the references from infoboxes (see the sketch after this list):

-- https://global.dbpedia.org/?s=https%3A%2F%2Fglobal.dbpedia.org%2Fid%2F4k5rp&p=http%3A%2F%2Fdbpedia.org%2Fontology%2FbibsysId&src=general

-- http://infoboxes.net/en/Giuseppe%20Garibaldi

- we received a comment suggesting that we sync up with MusicBrainz as well: https://meta.wikimedia.org/wiki/Grants_talk:Project/DBpedia/GlobalFactSyncRE#Interfacing_with_Wikidata's_data_quality_issues_in_certain_areas

- note that there will be 10 concrete sync targets, so that we make tangible progress. We hope for more suggestions.
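To give an idea of the raw material a sync target works with, here is a minimal sketch (not our pipeline) that pulls one property's values and attached references from the Wikidata API. Q539 (Giuseppe Garibaldi) and P569 (date of birth) are merely example picks:

import requests

# Sketch: fetch the values and references Wikidata holds for one
# statement, as raw material for comparing against an infobox value.
API = "https://www.wikidata.org/w/api.php"

resp = requests.get(API, params={
    "action": "wbgetclaims",
    "entity": "Q539",      # Giuseppe Garibaldi (example item)
    "property": "P569",    # date of birth (example property)
    "format": "json",
})
resp.raise_for_status()

for claim in resp.json().get("claims", {}).get("P569", []):
    value = claim["mainsnak"].get("datavalue", {}).get("value")
    refs = claim.get("references", [])
    print("value:", value)
    print("references attached:", len(refs))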


As part of DBpedia's development, we already have a way to query all IDs and assign a global one:

https://global.dbpedia.org/same-thing/lookup/?uri=http://dbpedia.org/resource/Giuseppe_Garibaldi
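The lookup can also be scripted. In this sketch the response field names ("global" and "locals") are assumptions on my part; please check the service's actual output:

import requests

# Sketch: resolve a local URI to its DBpedia global ID via the
# Same Thing Service linked above. The keys "global" and "locals"
# are assumed response fields; verify against the service docs.
LOOKUP = "https://global.dbpedia.org/same-thing/lookup/"

resp = requests.get(LOOKUP, params={
    "uri": "http://dbpedia.org/resource/Giuseppe_Garibaldi",
})
resp.raise_for_status()
data = resp.json()

print("global id:", data.get("global"))
for local_iri in data.get("locals", []):
    print("local id: ", local_iri)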

We are also devising a strategy to manage all links centrally and give feedback to sources.

There is also an engine to compare all values: https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf
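The paper has the details; as a toy illustration of the comparison step only (not the FlexiFusion engine itself, which runs on Apache Spark), grouping candidate values per source and reporting agreement could look like this, with made-up inputs:

from collections import Counter

# Toy illustration: given candidate values for the same
# (subject, property) from several sources, report the majority
# value and any conflicting sources. Inputs are invented.
candidates = {
    "enwiki":   "1807-07-04",
    "dewiki":   "1807-07-04",
    "frwiki":   "1807-07-04",
    "wikidata": "1807-07-05",   # made-up conflict for illustration
}

counts = Counter(candidates.values())
majority, votes = counts.most_common(1)[0]
agreement = votes / len(candidates)

print(f"majority value: {majority} ({agreement:.0%} agreement)")
for source, value in candidates.items():
    if value != majority:
        print(f"conflict: {source} says {value}")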


That said, these are all early prototypes and usability is still lacking, but as proofs of concept they are quite promising. Apache Spark provides the scalability.

All the best,

Sebastian





On 31.05.19 14:23, Thomas Douillard wrote:


That's by design, since identifiers on Wikidata are not some kind of
top-down process where every single actor's responsibility is defined
from the beginning.

This doesn't preclude good things from happening, as we've seen with LOC:
https://blogs.loc.gov/thesignal/2019/05/integrating-wikidata-at-the-library-of-congress/

I know that, but it does not preclude deeper collaboration and synchronisation between Wikidata's data and other resources, and it is the status and maturity of those potential collaborations and their workflows that I am interested in. Linking is useful of course, but things also happen on Wikidata, such as data curation and the identification of duplicate entries in the external database via their identifiers, when the same Wikidata item turns out to have two identifiers (for example, https://www.wikidata.org/wiki/Q539#P1015 has two values at the time of posting). I am interested to know whether some authority-control organisations exploit this potential through a workflow of periodic reviews/reports comparing their data against Wikidata's and examining the differences. And conversely, if we imported data from a data source, whether changes made to the original live data source are reflected in the Wikidata data.
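For the duplicate-identifier case, a minimal sketch of how one could surface such items via the public Wikidata Query Service (the query is standard SPARQL; Python is only used for the HTTP call):

import requests

# Sketch: list items carrying more than one value for P1015,
# the identifier shown above on Q539, via the public WDQS endpoint.
WDQS = "https://query.wikidata.org/sparql"

query = """
SELECT ?item (COUNT(?id) AS ?n) WHERE {
  ?item wdt:P1015 ?id .
}
GROUP BY ?item
HAVING (COUNT(?id) > 1)
LIMIT 10
"""

resp = requests.get(WDQS, params={"query": query, "format": "json"},
                    headers={"User-Agent": "dup-id-check-sketch/0.1"})
resp.raise_for_status()

for row in resp.json()["results"]["bindings"]:
    print(row["item"]["value"], "has", row["n"]["value"], "values")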

--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org