Hi Federico,
Maarten Dammers, 14/07/19 15:04:
Maybe one of the locals has more information.
Several, I think. The most significant I remember were from Sweden and Finland.
Any pointers?
Maybe http://www.ksamsok.se/in-english/ , but I know more about http://data.nationallibrary.fi/bib/sparql and related.
Anyway, their platform (Lodview) is quite nice. We should also add links to things like http://dati.beniculturali.it/iccd/schede/resource/GeographicalFeature/Comune_di_MONZAMBANO
I guess it wouldn't harm. Matching municipalities is often a major pain. The amount of "open data" which is released with usable references to municipalities is negligible; usually you end up manually matching names or codes in free text form in some CSV.
I took https://www.wikidata.org/wiki/Q42327 and
http://dati.beniculturali.it/iccd/schede/resource/GeographicalFeature/Comune_di_MONZAMBANO
to compare them:
* We have ISTAT ID set to 020036, they have owl:sameAs
http://dati.isprambiente.it/id/place/20036 which has haCodIstat
set to 020036 and links back to
Wikidata (and a lot more)
* We have Italian cadastre code F705, they have owl:sameAs
http://spcdata.digitpa.gov.it/Comune/F705
All sorts of cross links exist and we should be able to add
quite a few missing links.
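
The ISTAT comparison above can be sketched in a few lines of Python. This is only an illustration: it assumes the ISPRA place URI ends in the ISTAT code with the leading zero dropped (as in the Monzambano example) and that ISTAT municipality codes are six digits long. The function names are made up for this sketch.

```python
def istat_from_ispra_uri(uri: str) -> str:
    """Return the zero-padded ISTAT code taken from the last path segment."""
    tail = uri.rstrip("/").rsplit("/", 1)[-1]
    if not tail.isdigit():
        raise ValueError(f"no numeric code at the end of {uri!r}")
    # Assumption: codes are 6 digits and the URI drops leading zeros.
    return tail.zfill(6)


def same_municipality(wikidata_istat: str, ispra_uri: str) -> bool:
    """Compare the ISTAT ID stored on Wikidata with the one implied by the URI."""
    return wikidata_istat.strip().zfill(6) == istat_from_ispra_uri(ispra_uri)
```

Called with "020036" and http://dati.isprambiente.it/id/place/20036 this should report a match, which is the kind of check that could drive adding the missing cross links.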
It looks to me like http://dati.isprambiente.it/sparql/ is under a
free license ( http://dati.isprambiente.it/id/place/20036 gives
cc-by 4.0 international at the bottom) and I don't see it in the
report. I guess this one is good to add to
https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input#Incoming_nominations
?
The record is actually linked to http://dati.beniculturali.it/iccd/schede/resource/Site/Sito_di_S010537_Chiesa_Parrocchiale which links to http://dati.beniculturali.it/lodview/iccd/schede/resource/Address/Indirizzo_della_sede_di_S010537_Chiesa_Parrocchiale and http://dati.beniculturali.it/lodview/iccd/schede/resource/GeographicalFeature/Comune_di_MONZAMBANO giving a lot more context. Here you can also see that the same site seems to be listed twice ( http://dati.beniculturali.it/lodview/iccd/schede/resource/Site/Sito_di_S010537_Chiesa_Parrocchiale and http://dati.beniculturali.it/lodview/iccd/schede/resource/Site/Sito_di_S010538_Chiesa_Parrocchiale ) with something happening around 1935. Not sure which one we should link to.
and http://dati.beniculturali.it/iccd/schede/resource/uod/S010537 .
Importing the entire ontology itself can be trickier. More work on this side has been done by ICCU and ICCD (the ministry): usually it takes them a few years of manual work to connect an ontology.
For our import it was more important to handle the objects which had very little (structured) information. The more detailed descriptions are usually sparsely used (in the case you linked, only by one province which was cataloguing First World War damage? I don't know).
These federated queries seem to break a lot. Looks like my example started timing out in April.
With the federation in place, it's possible to set up automated reports to find mismatches between the data. See for example the report on https://www.wikidata.org/wiki/Property_talk:P1006/Mismatches . An obvious report for this domain would be monuments that are in the beniculturali database but not on Wikidata. Or do you already have something in place?
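
Once both identifier lists have been fetched, the "in the source but not on Wikidata" report boils down to a set difference. A rough sketch, with the function name and the sample lists made up for illustration (the two identifiers are the ones from the site URLs above, and which of them is "on Wikidata" is an assumption, not a query result):

```python
def missing_on_wikidata(source_ids, wikidata_ids):
    """Return source identifiers that no Wikidata item carries, sorted."""
    return sorted(set(source_ids) - set(wikidata_ids))


# Illustrative input only, not real query output.
source_ids = ["S010537", "S010538"]
wikidata_ids = ["S010537"]  # assumption for the example
report = missing_on_wikidata(source_ids, wikidata_ids)
```

The same function works whether the two lists come from one federated query or from two separate queries joined client side, which may be the safer route while the federated ones keep timing out.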
We used the SPARQL queries listed at the end of the page: <https://www.wikidata.org/?curid=30576438#Reports_for_cleanup_and_data_improvement>. I don't remember if federated queries were fast enough at the time to be usable; I only remember using them for small subsets of the data.
As far as I can see the bot is coded to get all the data at once and then see what needs doing. It doesn't attempt to get incremental updates with federated queries.
<https://github.com/synapta/wikidata-mibact-luoghi-cultura/blob/master/bot-mibact-to-wikidata/queries.js>
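
To make the "get everything, then see what needs doing" approach concrete, here is a small sketch of that kind of full-snapshot comparison. It is not taken from queries.js or the bot; the function name and the shape of the data (identifier mapped to a dict of property values) are assumptions for illustration.

```python
def plan_changes(source: dict, wikidata: dict):
    """Compare two full snapshots and return (to_create, to_update).

    Both arguments map an identifier to a dict of property -> value.
    """
    to_create = []   # identifiers with no Wikidata item yet
    to_update = {}   # identifier -> properties whose value differs
    for ident, fields in source.items():
        current = wikidata.get(ident)
        if current is None:
            to_create.append(ident)
            continue
        changed = {k: v for k, v in fields.items() if current.get(k) != v}
        if changed:
            to_update[ident] = changed
    return to_create, to_update
```

An incremental variant would need the source to expose some "modified since" information, which is a separate question from whether federation is fast enough.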
It probably produced items like
https://www.wikidata.org/wiki/Q55162430 ? Based on the link quite
a bit more info could be added.
Maybe this (linked open data and Wiki Loves Monuments) is
something fun to work on during the pre-conference of Wikimania.
We should probably get some of the missing SPARQL endpoints
whitelisted before so we won't be slowed down by that.
Maarten