Hi,
We're more than halfway through mapping YSO places to Wikidata. Most of the remaining ones are places that don't exist in Wikidata, and adding them is quite labor-intensive, so we will have to consider our strategy.
Anyway, I did some checking of what remains unmapped and noticed a potential problem: some mappings for places that we have mapped using Mix'n'match have not actually been stored in Wikidata. For example Q36 Poland ("Puola" in YSO Places) is such a case. In Mix'n'match it is shown as manually matched (see attached screenshot), but in Wikidata the corresponding YSO ID property doesn't actually exist for the entity. I checked the change history of the Q36 entity and couldn't find anything relevant there, so it seems that the mapping was never stored in Wikidata. Maybe there was a transient error of some kind?
Another such case was Q1754 Stockholm ("Tukholma" in YSO places). But for that one we removed the existing mapping in Mix'n'match and set it again, and now it is properly stored in Wikidata.
Mix'n'match currently reports 4228 mappings for YSO places, while a SPARQL query for the Wikidata endpoint returns 4221 such mappings. So I suspect that this only affects a small number of entities.
Is it possible to compare the Mix'n'match mappings with what actually exists in Wikidata, and somehow re-sync them? Or just to get the mappings out from Mix'n'match and compare them with what exists in Wikidata, so that the few missing mappings may be added there manually?
Thanks, Osma
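The comparison asked about above comes down to a set difference between two lists of (item, catalog ID) pairs. Here is a minimal sketch, assuming both sides have already been exported into such pairs; the function name, the data layout and the sample IDs are made up for illustration and do not describe how Mix'n'match itself works.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (Wikidata item ID, external catalog ID)


def compare_mappings(mixnmatch: Iterable[Pair],
                     wikidata: Iterable[Pair]) -> Dict[str, List[Pair]]:
    """Report which mappings exist on only one side."""
    mnm, wd = set(mixnmatch), set(wikidata)
    return {
        "only_in_mixnmatch": sorted(mnm - wd),
        "only_in_wikidata": sorted(wd - mnm),
    }


# Made-up sample data: "ID-A" and "ID-B" are placeholders, not real YSO IDs.
mixnmatch_export = [("Q36", "ID-A"), ("Q1754", "ID-B")]
wikidata_result = [("Q1754", "ID-B")]

report = compare_mappings(mixnmatch_export, wikidata_result)
print(report["only_in_mixnmatch"])
```

The pairs listed under "only_in_mixnmatch" would be the candidates for adding to Wikidata by hand.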
There is a sync function per catalog, in your case:
https://tools.wmflabs.org/mix-n-match/#/sync/473
This is also linked from the "Action" drop-down menu for the respective catalog.
I am running it now, so the numbers should be low (14 not on Wikidata when I got there).
On Mon, Aug 21, 2017 at 10:07 AM Osma Suominen osma.suominen@helsinki.fi wrote:
-- 
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
Thanks Magnus, I hadn't noticed this sync function.
When I open the sync page, it says "24 connections on Wikidata, but not here". I clicked on "Update Mix'n'match", and after a while it says "Done", but when I refresh the page, it still says there are 24 connections. When I started it said 27, then after one update it was 26, then 24, but now I can't get it any lower. I wonder what's going on?
It also says "3 connections here, but not on Wikidata". When I click on Update Wikidata, I get to the Quick Statements tool, with three statements that I suppose would add the missing connections. But when I click on "Do it" nothing happens. I also checked the 3 generated YSO ID statements, but they were already in Wikidata, added several days ago!
So something seems fishy here, the numbers don't quite add up and apparently not everything can be synced between Mix'n'match and Wikidata. But I think the important information got synced, so all is (relatively) well.
Part of the reason may be that there are a few YSO concepts that are linked to from Wikidata entities using the same YSO ID property, but these are not included in the "YSO Places" catalog in Mix'n'match because they are not places. I suspect at least some of those "24 connections on Wikidata, but not here" may be like that - they don't match any of the IDs in the YSO Places catalog.
-Osma
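The suspicion above (Wikidata statements whose IDs are simply not part of the YSO Places catalog) could be checked by splitting the Wikidata-side pairs by catalog membership. A rough sketch, assuming the catalog's IDs are available as a plain set; the function name and the sample values are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (Wikidata item ID, external catalog ID)


def split_by_catalog(wikidata_pairs: Iterable[Pair],
                     catalog_ids: Set[str]) -> Tuple[List[Pair], List[Pair]]:
    """Separate pairs whose ID is in the catalog from those whose ID is not."""
    in_catalog: List[Pair] = []
    outside: List[Pair] = []
    for qid, ext_id in wikidata_pairs:
        if ext_id in catalog_ids:
            in_catalog.append((qid, ext_id))
        else:
            outside.append((qid, ext_id))
    return in_catalog, outside


# Made-up sample data: the IDs below are placeholders only.
places_catalog = {"ID-A", "ID-B"}
on_wikidata = [("Q36", "ID-A"), ("Q100", "ID-NOT-A-PLACE")]

inside, outside = split_by_catalog(on_wikidata, places_catalog)
print(outside)
```

Anything that ends up in the second list can never be synced into this catalog, which would explain a count that refuses to drop.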
Magnus Manske wrote on 21.08.2017 at 12:25:
On Mon, Aug 21, 2017 at 12:37 PM Osma Suominen osma.suominen@helsinki.fi wrote:
> Thanks Magnus, I hadn't noticed this sync function.
> When I open the sync page, it says "24 connections on Wikidata, but not here". I clicked on "Update Mix'n'match", and after a while it says "Done", but when I refresh the page, it still says there are 24 connections. When I started it said 27, then after one update it was 26, then 24, but now I can't get it any lower. I wonder what's going on?
There may be items with YSO IDs on Wikidata that are not in Mix'n'match.
> It also says "3 connections here, but not on Wikidata". When I click on Update Wikidata, I get to the Quick Statements tool, with three statements that I suppose would add the missing connections. But when I click on "Do it" nothing happens. I also checked the 3 generated YSO ID statements, but they were already in Wikidata, added several days ago!
When you say "nothing happens", did you scroll down? There's a bug that shows the manual again, so the edits are going on "below the fold".
The "diff" functionality for sync uses the Wikidata SPARQL interface, which might lack the odd statement. Or the Q item in Mix'n'match is now deleted, or a redirect, in which case editing won't work.
> So something seems fishy here, the numbers don't quite add up and apparently not everything can be synced between Mix'n'match and Wikidata. But I think the important information got synced, so all is (relatively) well.
Automation can only cope with so much human activity :-(
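For the redirect case mentioned in this reply, one way to check a handful of items is a SPARQL query against the Wikidata Query Service. The sketch below only builds the query text. It assumes that redirects are exposed as owl:sameAs triples and that the wd: and owl: prefixes are predefined by the service; the function name is made up.

```python
import re
from typing import Iterable


def build_redirect_query(qids: Iterable[str]) -> str:
    """Build a SPARQL query that lists which of the given items are redirects."""
    checked = []
    for qid in qids:
        # Reject anything that is not a plain item ID such as "Q36".
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
        checked.append(f"wd:{qid}")
    values = " ".join(checked)
    return (
        "SELECT ?item ?target WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item owl:sameAs ?target .\n"
        "}"
    )


query = build_redirect_query(["Q11850007", "Q11884353"])
print(query)
```

Items that come back with a ?target are redirects. Deleted items would need a separate check, since they simply return nothing.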
Magnus Manske wrote on 21.08.2017 at 15:03:
> There may be items with YSO IDs on Wikidata that are not in Mix'n'match.
Right.
> When you say "nothing happens", did you scroll down? There's a bug that shows the manual again, so the edits are going on "below the fold".
Ah, thanks, I didn't notice that. I get this:
Processing Q11850007 (Q11850007 P2347 "112658")
Processing Q11884353 (Q11884353 P2347 "148037")
Processing Q11892087 (Q11892087 P2347 "116422")
All done!.
> The "diff" functionality for sync uses the Wikidata SPARQL interface, which might lack the odd statement. Or the Q item in Mix'n'match is now deleted, or a redirect, in which case editing won't work.
This SPARQL query in WDQS:
SELECT ?target WHERE { wd:Q11850007 wdt:P2347 ?target. }
returns the expected value ("112658").
Despite this, the sync tool claims that 3 connections are missing from Wikidata, and when I click on "Update Wikidata", I get to Quick Statements with this as the first generated statement.
-Osma
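To avoid re-submitting statements that already exist, the Quick Statements input could be filtered against what a SPARQL check returns. A minimal sketch, assuming the tab-separated Quick Statements text format (item, property, quoted string value). The function name is made up, and the "already on Wikidata" set is invented for the example rather than taken from real query results.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (Wikidata item ID, YSO ID)


def quickstatements_for_missing(mixnmatch: Iterable[Pair],
                                wikidata: Iterable[Pair],
                                prop: str = "P2347") -> List[str]:
    """Emit one Quick Statements line per mapping not yet on Wikidata."""
    existing = set(wikidata)
    lines = []
    for qid, ext_id in mixnmatch:
        if (qid, ext_id) not in existing:
            lines.append(f'{qid}\t{prop}\t"{ext_id}"')
    return lines


# The three pairs from the tool output quoted above.
from_mixnmatch = [("Q11850007", "112658"),
                  ("Q11884353", "148037"),
                  ("Q11892087", "116422")]
# Assumption for the example: only the first pair was confirmed via SPARQL.
confirmed_on_wikidata = [("Q11850007", "112658")]

commands = quickstatements_for_missing(from_mixnmatch, confirmed_on_wikidata)
print("\n".join(commands))
```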
Hi Osma,
Regarding adding missing items: I've had good experiences with creating input files for Quickstatements2 (see https://github.com/zbw/repec-ras/blob/master/bin/create_missing_wikidata.pl). I've discussed how best to do this in the Wikidata Project Chat before, and received valuable advice (https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2017/05#Source_statements_for_items_syntesized_from_authorities_-_recommendations.3F).
Feel free to ask for further information, and all the best, Joachim
-----Original Message----- From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On Behalf Of Osma Suominen Sent: Monday, 21 August 2017 11:07 To: Discussion list for the Wikidata project. Subject: [Wikidata] Some Mix'n'match mappings not stored in Wikidata?
Hi Joachim,
Thanks for this, indeed this could be a potential strategy for us to add some or all of the missing entities. The challenge is that we would need to be reasonably sure that the places we want to create actually don't exist in Wikidata, for example using an alternate spelling. You said in your question that "Of course we make sure that neither of the ids exist in WD so far", but how did you do that?
-Osma
Neubert, Joachim wrote on 21.08.2017 at 12:36:
Hi Osma,
The instrument we used to avoid duplicates was Mix-n-match. Even when something is not "automatically matched", often, on the details page (e.g., https://tools.wmflabs.org/mix-n-match/#/entry/22734337), possible matches come up.
That covers the case where a (partial) name is present somewhere in Wikidata or Wikipedia. Unfortunately, I've not yet figured out how I could feed my own synonyms into Mix-n-match. Providing them in the description field helps for intellectual identification, but they do not seem to be used by the matching algorithm. Possibly, a separate "catalog" with permuted name variants from not-yet-matched entries could help, but I'm not sure if Magnus would encourage that, because it clutters the catalog list. Swedish and Finnish names for the same locations, however, could perhaps be a valid use case.
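As a rough illustration of the "permuted name variants" idea, the sketch below reorders the parts of a name. It is only one possible reading of that idea: the function name and the cap on the number of parts are made up, and nothing here reflects how Mix-n-match matches names.

```python
from itertools import permutations
from typing import List


def name_variants(name: str, max_parts: int = 4) -> List[str]:
    """Return all orderings of the name's parts, ignoring commas."""
    parts = name.replace(",", " ").split()
    if len(parts) > max_parts:
        # Too many parts: the number of orderings grows factorially, so give up.
        return [" ".join(parts)]
    return sorted({" ".join(p) for p in permutations(parts)})


variants = name_variants("Neubert, Joachim")
print(variants)
```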
Anyway, with the 2,200 missing RePEc authors I decided at that point that the result was good enough, and created the not-matched entries. Less than a handful showed up later on as duplicates (e.g., as automatically matched against GND). Of course, some will still linger hidden. But it is very easy to merge items in Wikidata, so I consider that a much smaller problem than it would be in library systems, where it is administratively and technically much more difficult to get rid of duplicates.
Cheers, Joachim (and sorry for the late response)
-----Original Message----- From: Wikidata [mailto:wikidata-bounces@lists.wikimedia.org] On Behalf Of Osma Suominen Sent: Monday, 21 August 2017 13:41 To: wikidata@lists.wikimedia.org Subject: Re: [Wikidata] Some Mix'n'match mappings not stored in Wikidata?
Neubert, Joachim, 28/08/2017 14:45:
> Anyway, with the 2,200 missing RePEc authors I decided at that point that the result was good enough, and created the not-matched entries. Less than a handful showed up later on as duplicates (e.g., as automatically matched against GND). Of course, some will still linger hidden. But it is very easy to merge items in Wikidata, so I consider that a much smaller problem than it would be in library systems, where it is administratively and technically much more difficult to get rid of duplicates.
I agree, this is a sensible process. When the initial matching is done, the data and the work of further polishing need to move to the wiki.
Nemo
Hi Joachim,
Thanks a lot, this is extremely valuable for us!
I'm not sure I trust the Mix'n'match algorithm enough to determine that the results are good enough - I would feel more comfortable if there was some additional confirmation that the leftover places really do not exist in Wikidata, for example after using alternate and/or Swedish language labels to find additional match candidates.
Mix'n'match also apparently doesn't distinguish between entities that were not matched because no candidates were found in Wikidata to match against, versus entities that were not mapped because there was more than one candidate available. At the moment we have a mix of both types of failed matches in the Unmatched category. It would probably be fairly safe to bulk-add the places that didn't match against anything, but I don't know how to extract that kind of list from Mix'n'match.
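The distinction described above could be made once match candidates per entry are available from some other source, for example a reconciliation run. A minimal sketch, assuming a dictionary from catalog entry to a list of candidate items; this data structure and the sample values are hypothetical and are not something Mix'n'match is known to export.

```python
from typing import Dict, List, Tuple


def split_unmatched(
    candidates: Dict[str, List[str]]
) -> Tuple[List[str], List[str], List[str]]:
    """Group unmatched entries by how many candidate items they have."""
    none = sorted(e for e, c in candidates.items() if len(c) == 0)
    single = sorted(e for e, c in candidates.items() if len(c) == 1)
    several = sorted(e for e, c in candidates.items() if len(c) > 1)
    return none, single, several


# Made-up sample data: entry names and candidate IDs are placeholders.
sample = {
    "place-1": [],
    "place-2": ["Q100"],
    "place-3": ["Q200", "Q300"],
}

no_candidates, one_candidate, ambiguous = split_unmatched(sample)
print(no_candidates)
```

Only the first group would be a reasonably safe input for bulk creation. The other two still need a human decision.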
My current plan is to take the remaining unmapped places and try to reconcile them using OpenRefine; if there are still no matches, then I can go ahead and add them to Wikidata, most likely using the Quick Statements tool, which seems really convenient for this.
-Osma