Hi Osma,

just a few remarks:

* If you want to "seed" Mix'n'match with third-party/indirect IDs already in Wikidata, it is best not to create the catalog yourself but to mail me the data instead.

* If you want "YSO places" in Wikidata, we will need a new property for that, unless the P2347 formatter URL redirects automatically to "/yso-paikat/".

* You can create a Mix'n'match catalog before there is a property, and link them up later. The catalog will then synchronize.

Cheers,
Magnus

On Tue, Jun 6, 2017 at 11:19 AM Osma Suominen <osma.suominen@helsinki.fi> wrote:
Hi Wikidatans,

After several delays we are finally starting to think seriously about
mapping the General Finnish Ontology YSO [1] to Wikidata. A "YSO ID"
property (https://www.wikidata.org/wiki/Property:P2347) was added to
Wikidata some time ago, but it has been used only a few times so far.

Recently some 6000 places have been added to "YSO Places" [2], a new
extension of YSO, which was generated from place names in YSA and
Allärs, our earlier subject indexing vocabularies. It would probably
make sense to map these places to Wikidata, in addition to the general
concepts in YSO. We have already manually added a few links from YSA/YSO
places to Wikidata for newly added places, but this approach does not
scale if we want to link the thousands of existing places.

We also have some indirect sources of YSO/Wikidata mappings:

1. YSO is mapped to LCSH, and Wikidata also to LCSH (using P244, LC/NACO
Authority File ID). I dug a bit into both sets of mappings and found
that approximately 1200 YSO-Wikidata links could be generated from the
intersection of these mappings.

2. The Finnish broadcasting company Yle has also created some mappings
between KOKO (which includes YSO) and Wikidata. Last time I looked at
those, we could generate at least 5000 YSO-Wikidata links from them.
Probably more nowadays.
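
The intersection described in item 1 could be sketched roughly as
follows. This is only a minimal illustration, assuming both mapping sets
have been exported as simple (ID, LCSH ID) pairs. The function names and
all sample IDs are made up, and the "unambiguous" filter is just one
possible way to reduce the risk of wrong links.

```python
from collections import defaultdict


def links_via_lcsh(yso_to_lcsh, wikidata_to_lcsh):
    """Derive candidate (YSO ID, QID) pairs that share an LCSH ID."""
    # Index the Wikidata side: LCSH ID -> QIDs that carry it as P244.
    qids_by_lcsh = defaultdict(set)
    for qid, lcsh_id in wikidata_to_lcsh:
        qids_by_lcsh[lcsh_id].add(qid)

    candidates = set()
    for yso_id, lcsh_id in yso_to_lcsh:
        for qid in qids_by_lcsh.get(lcsh_id, ()):
            candidates.add((yso_id, qid))
    return candidates


def unambiguous(candidates):
    """Keep only pairs where the YSO ID and the QID each occur once."""
    by_yso = defaultdict(set)
    by_qid = defaultdict(set)
    for yso_id, qid in candidates:
        by_yso[yso_id].add(qid)
        by_qid[qid].add(yso_id)
    return {
        (yso_id, qid)
        for yso_id, qid in candidates
        if len(by_yso[yso_id]) == 1 and len(by_qid[qid]) == 1
    }


# Invented sample data, not real YSO, LCSH or Wikidata identifiers.
yso_to_lcsh = [("p1", "sh001"), ("p2", "sh002"), ("p3", "sh002"), ("p4", "sh999")]
wikidata_to_lcsh = [("Q1", "sh001"), ("Q2", "sh002"), ("Q3", "sh003")]

candidates = links_via_lcsh(yso_to_lcsh, wikidata_to_lcsh)
print(sorted(unambiguous(candidates)))  # [('p1', 'Q1')]
```

In this toy data p2 and p3 both reach Q2 through the same LCSH heading,
so those pairs are held back for manual checking rather than imported.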


Of course, indirect mappings are a bit dangerous. It's possible that
there are some differences in meaning, especially with LCSH which has a
very different structure (and cultural context) than YSO. Nevertheless I
think these could be a good starting point, especially if a tool such as
Mix'n'Match could be used to verify them.

Now my question is, given that we already have or could easily generate
thousands of Wikidata-YSO mappings, but the rest would still have to be
semi-automatically linked using Mix'n'Match, what would be a good way to
approach this? Does Mix'n'Match look at existing statements (in this
case YSO ID / P2347) in Wikidata when you load a new catalog, or ignore
them?
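
For reference, the statements that already exist can be listed
independently of Mix'n'Match with a SPARQL query on P2347. Below is a
minimal sketch, assuming the result is fetched separately from a SPARQL
endpoint in the standard SPARQL JSON results format. The function name
is hypothetical and the sample values are invented.

```python
import json

# SPARQL for every item that carries a YSO ID (P2347) statement.
EXISTING_YSO_QUERY = "SELECT ?item ?yso WHERE { ?item wdt:P2347 ?yso }"


def existing_links(sparql_json):
    """Map each YSO ID to the set of QIDs that already point to it."""
    links = {}
    for row in sparql_json["results"]["bindings"]:
        # Item URIs look like http://www.wikidata.org/entity/Q42
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        yso_id = row["yso"]["value"]
        links.setdefault(yso_id, set()).add(qid)
    return links


# Hand-written sample response; the identifiers are invented.
sample = json.loads("""
{
  "head": {"vars": ["item", "yso"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1"},
     "yso": {"type": "literal", "value": "1"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q2"},
     "yso": {"type": "literal", "value": "2"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q3"},
     "yso": {"type": "literal", "value": "2"}}
  ]}
}
""")

links = existing_links(sample)
# YSO IDs attached to more than one item deserve a manual look.
duplicates = {yso_id for yso_id, qids in links.items() if len(qids) > 1}
print(duplicates)  # {'2'}
```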

I can think of at least these approaches:

1. First import the indirect mappings we already have to Wikidata as
P2347 statements, then create a Mix'n'Match catalog with the remaining
YSO concepts. The indirect mappings would have to be verified separately.

2. First import the indirect mappings we already have to Wikidata as
P2347 statements, then create a Mix'n'Match catalog with ALL the YSO
concepts, including the ones for which we have already imported a
mapping. Use Mix'n'Match to verify the indirect mappings.

3. Forget about the existing mappings and just create a Mix'n'Match
catalog with all the YSO concepts.
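
In terms of data, the three approaches differ only in which concepts end
up in the catalog and whether any P2347 statements are imported first.
A rough sketch of that difference, with a hypothetical function name and
invented concept records:

```python
def plan(concepts, premapped_ids, approach):
    """Return (catalog entries, IDs to import as P2347 beforehand)."""
    if approach == 1:
        # Import the indirect mappings, load only the unmapped concepts.
        catalog = [c for c in concepts if c["id"] not in premapped_ids]
        return catalog, set(premapped_ids)
    if approach == 2:
        # Import the indirect mappings, load everything for verification.
        return list(concepts), set(premapped_ids)
    if approach == 3:
        # Ignore the indirect mappings entirely.
        return list(concepts), set()
    raise ValueError("approach must be 1, 2 or 3")


# Invented sample concepts; real YSO data would come from an export.
concepts = [
    {"id": "p1", "label": "cats"},
    {"id": "p2", "label": "dogs"},
    {"id": "p3", "label": "lakes"},
]
premapped_ids = {"p1", "p3"}

catalog, to_import = plan(concepts, premapped_ids, approach=1)
print([c["id"] for c in catalog])  # ['p2']
```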

Any advice?

Thanks,

-Osma

[1] http://finto.fi/yso/

[2] http://finto.fi/yso-paikat/

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata