Hello everyone,
I have a question for people who are using the Wikidata reconciliation service: https://tools.wmflabs.org/wikidata-reconcile/ It was working perfectly in my OpenRefine in November 2016, but in December it stopped working. I have already contacted Magnus Manske, but he hasn’t responded yet. Does anyone else experience problems with the service, and do you know how to fix it?
I’m using this service to link large lists of Belgian artists (37,000) and performance art organisations (1,000) to Wikidata, in preparation for uploading contextual data about these persons and organisations to Wikidata. This data will come from the Kunstenpunt database (http://data.kunsten.be/people). Wikimedia user Romaine (https://meta.wikimedia.org/wiki/User:Romaine) is helping us with this project.
Best regards, Alina
-- Available Mon, Tue, Wed, Thu
PACKED vzw - Expertisecentrum Digitaal Erfgoed
Rue Delaunoystraat 58 bus 23, B-1080 Brussel, Belgium
e alina@packed.be t +32 (0)2 217 14 05 w www.packed.be
Hi,
I'm also very interested in this. How did you configure your OpenRefine to use Wikidata? (Even if it does not currently work, I am interested in the setup.)
There is currently an open issue (with a nice bounty) to improve the integration of Wikidata in OpenRefine: https://github.com/OpenRefine/OpenRefine/issues/805
Best regards, Antonin
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Alina, I just found your bug report, which you filed under the wrong issue tracker. The git repo (source code, issue tracker, etc.) is here: https://bitbucket.org/magnusmanske/reconcile
The report says it "keeps hanging", which is so vague that it's impossible to debug, especially since the example linked on https://tools.wmflabs.org/wikidata-reconcile/ works perfectly fine for me.
Does it not work at all for you? Does it work for a time, but then stops? Does it "break" reproducibly on specific queries, or at random? Maybe it breaks for specific "types" only? At what rate are you hitting the tool? Do you have an example query, preferably one that breaks?
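For readers who want to probe the service themselves, a single query in the OpenRefine reconciliation format can be assembled like this. This is only a sketch: the entity name, "type", and "limit" values are illustrative, and whether this particular endpoint honours all of them is an assumption.

```python
import json
import urllib.parse

# One reconciliation query in the OpenRefine Reconciliation API shape.
# "q0" is a caller-chosen key; Q5 restricts candidates to humans.
queries = {"q0": {"query": "Rene Magritte", "type": "Q5", "limit": 3}}

# The service receives the JSON-encoded queries as a URL parameter.
url = ("https://tools.wmflabs.org/wikidata-reconcile/?queries="
       + urllib.parse.quote(json.dumps(queries)))
print(url)
```

Fetching that URL should return a JSON object keyed by "q0" with a ranked "result" list; a query that reproducibly hangs or errors at this level would be exactly the kind of example Magnus asks for.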
Please note that this is not an "official" WMF service, only parts of the API are implemented, and there are currently other technical limitations on it.
Cheers, Magnus
If you want to match your list to Wikidata, to find which entries already exist, have you considered Mix'n'match? https://tools.wmflabs.org/mix-n-match/
You can upload your names and identifiers at https://tools.wmflabs.org/mix-n-match/import.php
There are several mechanisms in place to help with the matching. Please contact me if you need help!
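For a concrete idea of what such an upload might look like, here is a sketch that writes entries as tab-separated lines of (your own identifier, name, description). The entries below are invented, and the exact column layout the import page expects should be checked on the form itself.

```python
import csv
import io

# Invented sample entries: (local identifier, name, short description).
entries = [
    ("kp-0001", "Jan Fabre", "Belgian performance artist"),
    ("kp-0002", "Rosas", "Brussels-based dance company"),
]

# Mix'n'match imports are plain text: one entry per line, tab-separated.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerows(entries)
print(buf.getvalue())
```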
Hi Magnus,
Mix'n'match looks great, and I do have a few questions about it. I'd like to use it to import a dataset which looks like this (these are the first 100 lines): http://pintoch.ulminfo.fr/34f8c4cf8a/aligned_institutions.txt
I see how to import it in Mix'n'match, but given all the columns I have in this dataset, it seems a shame to fall back to matching on the name only.
Do you see any way to do some fuzzy-matching on, say, the URLs provided in the dataset against the "official website" property? I think that it would be possible with the (proposed) Wikidata interface for OpenRefine (if I understand the UI correctly).
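One low-tech way to approximate that URL matching, independent of any tool, is to normalise both URLs to a comparable key before testing equality. A minimal sketch; the normalisation rules here are my own assumption, not what any existing service does:

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable key: drop scheme, 'www.', trailing slash."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc.lower().removeprefix("www.")  # needs Python 3.9+
    return host + parsed.path.rstrip("/")

# Comparing a dataset URL with a Wikidata "official website" (P856) value:
print(normalize_url("http://www.example.org/") == normalize_url("https://example.org"))  # True
```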
In this context, I think it might even be possible to confirm matches automatically (when the matches are excellent on multiple columns). As the dataset is rather large (400,000 lines) I would not really want to validate them one after the other with the web interface. So I would need a sort of batch edit. How would you do that?
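Absent a built-in batch mode, one could imagine auto-confirming only those candidates where several independent columns agree. A toy sketch; the field names, point values, and threshold are all invented for illustration:

```python
def match_score(record: dict, candidate: dict) -> int:
    """Award points per agreeing column; auto-confirm only high scores."""
    score = 0
    if record.get("name", "").lower() == candidate.get("label", "").lower():
        score += 2
    if record.get("website") and record.get("website") == candidate.get("official_website"):
        score += 2
    if record.get("city") and record.get("city") == candidate.get("city"):
        score += 1
    return score

rec = {"name": "Kunstenpunt", "website": "kunsten.be", "city": "Brussels"}
cand = {"label": "Kunstenpunt", "official_website": "kunsten.be", "city": "Brussels"}
print(match_score(rec, cand) >= 4)  # True: name + website agreement clears the bar
```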
Finally, once matches are found, it would be great if statements corresponding to the various columns could be created in the items (if these statements don't already exist). With the appropriate reference to the dataset, ideally.
I realise this is a lot to ask - maybe I should just write a bot.
Alina, sorry to hijack your thread. I hope my questions were general enough to be interesting for other readers.
Cheers, Antonin
Hey Alina,
Thanks for letting us know about this.
I'll start testing it after configuring OpenRefine (as its API is implemented on WMF).
Could you share the open task related to this with me?
Cheers, Amit Kumar Jaiswal
Everyone,
Yes, our OpenRefine API can use Multiple Query Mode (reconciling an entity using multiple columns / WD properties):
https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API#multiple-query-mode
I do not think that Magnus has implemented our Multiple Query Mode yet, however. The bounty issue https://github.com/OpenRefine/OpenRefine/issues/805 that I created and funded on BountySource.com is to fully implement the Multiple Query Mode API and ensure that it works correctly in the latest OpenRefine 2.6 RC2.
Happy hacking, everyone :) Let us know if we can answer any questions regarding OpenRefine or the Reconcile API on our own mailing list: http://groups.google.com/group/openrefine/
-Thad
Another alternative (and apologies if you consider it off-topic) is the Wikipedia and Wikidata Tools for Google Spreadsheets, https://chrome.google.com/webstore/detail/wikipedia-and-wikidata-to/aiilcelhmpllcgkhhpifagfehbddkdfp (made by Dr. Thomas Steiner of this group). I'm working on about 250,000 names (in multiple sheets), most with life dates. This add-on matches just on the name and returns the QID, which you can then use to pull back Wikidata's birth (P569) and death (P570) dates. After a bit of cleaning of that data, I can instantly tell whether it has matched the correct name. It's been very useful for matching "low hanging fruit", and perhaps one of the other options above is more appropriate for the remaining, more difficult matches.
David
David Lowe | The New York Public Library
Specialist II, Photography Collection
Photographers' Identities Catalog: http://pic.nypl.org
+1 from someone who would be so extremely happy (and much more productive) if such a service were implemented in OpenRefine.
I also added it as a task to Phabricator, feel free to comment, add suggestions… https://phabricator.wikimedia.org/T146740
Best, Sandra/User:Spinster
On 26 Jan 2017, at 19:00, Thad Guidry <thadguidry@gmail.com> wrote:
Everyone,
Yes, our OpenRefine API can use Multiple Query Mode (reconciling an Entity by using multiple columns/ WD properties)
https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API#multiple-query-mode I do not think that Magnus has implemented our Multiple Query Mode yet, however. The bounty issue https://github.com/OpenRefine/OpenRefine/issues/805 that I created and funded on BountySource.com is to fully implement the Multiple Query Mode API and ensure that it works correctly in OpenRefine 2.6 RC2 latest.
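For readers who haven't seen it, a multiple-query-mode request is a JSON object keyed by query IDs, where each query can carry extra column/property constraints. A minimal sketch in Python (the names, QIDs, and property values below are illustrative, not from the thread):

```python
import json

# Sketch of a Reconciliation Service API "multiple query mode" payload.
# Each key (q0, q1, ...) is one row to reconcile; the "properties" list
# lets extra columns (here: official website, P856) guide the match.
queries = {
    "q0": {
        "query": "Kunstenpunt",
        "type": "Q43229",          # organization (example type constraint)
        "limit": 3,
        "properties": [
            {"pid": "P856", "v": "http://data.kunsten.be"},
        ],
    },
    "q1": {
        "query": "PACKED vzw",
        "limit": 3,
    },
}

# The service expects the whole batch as a single form parameter: queries=<json>
payload = {"queries": json.dumps(queries)}
print(payload["queries"])
```

The response is keyed by the same q0/q1 IDs, which is what lets OpenRefine reconcile many rows per HTTP request instead of one at a time.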
Happy Hacking, anyone :) Let us know if we can answer any questions regarding OpenRefine or the Reconcile API on our own mailing list: http://groups.google.com/group/openrefine/
-Thad
On Thu, Jan 26, 2017 at 11:18 AM AMIT KUMAR JAISWAL <amitkumarj441@gmail.com> wrote:

Hey Alina,
Thanks for letting us know about this.
I'll start testing it after configuring OpenRefine (as its API is implemented in WMF).
Can you share the open task related to this?
Cheers, Amit Kumar Jaiswal
On 1/26/17, Antonin Delpeuch (lists) <lists@antonin.delpeuch.eu> wrote:
Hi Magnus,
Mix'n'match looks great and I do have a few questions about it. I'd like to use it to import a dataset, which looks like this (these are the first 100 lines): http://pintoch.ulminfo.fr/34f8c4cf8a/aligned_institutions.txt
I see how to import it in Mix'n'match, but given all the columns I have in this dataset, I think that it is a bit sad to resort to matching on the name only.
Do you see any way to do some fuzzy-matching on, say, the URLs provided in the dataset against the "official website" property? I think that it would be possible with the (proposed) Wikidata interface for OpenRefine (if I understand the UI correctly).
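Fuzzy-matching URLs usually comes down to normalising them before comparison. A minimal sketch of the kind of normalisation that could be used to compare a dataset URL against an "official website" (P856) value (the helper name and the exact rules are my own illustration):

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable key: drop scheme, 'www.', and trailing slash."""
    parsed = urlparse(url.strip().lower())
    host = parsed.netloc or parsed.path  # handle bare "example.org" inputs
    if host.startswith("www."):
        host = host[4:]
    path = parsed.path if parsed.netloc else ""
    return (host + path).rstrip("/")

# Two spellings of the same site compare equal after normalisation:
print(normalize_url("http://www.packed.be/") == normalize_url("https://packed.be"))  # → True
```

Real-world matching would also want to handle redirects and dead links, but even this cheap key already catches the scheme/www/slash variants that defeat exact string comparison.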
In this context, I think it might even be possible to confirm matches automatically (when the matches are excellent on multiple columns). As the dataset is rather large (400,000 lines) I would not really want to validate them one after the other with the web interface. So I would need a sort of batch edit. How would you do that?
Finally, once matches are found, it would be great if statements corresponding to the various columns could be created in the items (if these statements don't already exist). With the appropriate reference to the dataset, ideally.
I realise this is a lot to ask - maybe I should just write a bot.
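One lightweight alternative to a full bot, once matches are confirmed, would be to emit QuickStatements-style commands (tab-separated: item, property, value, then optional reference columns). The QIDs and rows below are made-up examples, and the column layout follows my understanding of the QuickStatements v1 syntax:

```python
# Sketch: turn confirmed matches into QuickStatements v1 commands.
# P856 = official website; S854 = reference URL attached to the claim.
matches = [
    {"qid": "Q111111", "website": "http://data.kunsten.be/people/1"},
    {"qid": "Q222222", "website": "http://data.kunsten.be/people/2"},
]

SOURCE_URL = "http://pintoch.ulminfo.fr/34f8c4cf8a/aligned_institutions.txt"

def to_quickstatements(rows):
    lines = []
    for row in rows:
        # String/URL values are wrapped in double quotes in QuickStatements.
        lines.append("\t".join([
            row["qid"], "P856", f'"{row["website"]}"', "S854", f'"{SOURCE_URL}"',
        ]))
    return "\n".join(lines)

print(to_quickstatements(matches))
```

This keeps the "create statements with a reference to the dataset" step as a reviewable text file rather than direct API writes.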
Alina, sorry to hijack your thread. I hope my questions were general enough to be interesting for other readers.
Cheers, Antonin
On 26/01/2017 16:01, Magnus Manske wrote:
If you want to match your list to Wikidata, to find which entries already exist, have you considered Mix'n'match? https://tools.wmflabs.org/mix-n-match/
You can upload your names and identifiers at https://tools.wmflabs.org/mix-n-match/import.php
There are several mechanisms in place to help with the matching. Please contact me if you need help!
On Thu, Jan 26, 2017 at 3:58 PM Magnus Manske <magnusmanske@googlemail.com> wrote:
Alina, I just found your bug report, which you filed under the wrong issue tracker. The git repo (source code, issue tracker etc.) is here: https://bitbucket.org/magnusmanske/reconcile

The report says it "keeps hanging", which is so vague that it's impossible to debug, especially since the example linked on https://tools.wmflabs.org/wikidata-reconcile/ works perfectly fine for me.

Does it not work at all for you? Does it work for a time, but then stops? Does it "break" reproducibly on specific queries, or at random? Maybe it breaks for specific "types" only? At what rate are you hitting the tool? Do you have an example query, preferably one that breaks?

Please note that this is not an "official" WMF service, only parts of the API are implemented, and there are currently other technical limitations on it.

Cheers, Magnus
I have just rounded up the bounty to $300. This is a dream feature, we need it! :)
Antonin
On 27/01/2017 13:12, Sandra Fauconnier wrote:
Hi Antonin,
mix'n'match is designed to work with almost any dataset, thus uses the common denominator, which is names, for matching.
There are mechanisms to match on other properties, but writing an interface for public consumption for this would be a task that could easily keep an entire team of programmers busy :-)
If you can give me the whole list to download, I will see what I can do in terms of auxiliary data matching. Maybe a combination of that, manual matches (or at least confirmations on name matches), and the OpenRefine approach will give us maximum coverage.
It appears Kunstenpunt has no Wikidata property yet. Maybe Romaine could start setting one up? That would help in terms of synchronisation, I believe.
Cheers, Magnus
Hi Magnus,
The dataset is essentially this one: http://isni.ringgold.com/
I am currently augmenting it with Ringgold IDs (P3500) using ORCID (this should be completed in a few days). This alignment only adds two columns (Ringgold ID and organization type) which should not impact the task of matching it with Wikidata (as there are virtually no Ringgold IDs in Wikidata yet).
Cheers, Antonin
On 27/01/2017 09:18, Magnus Manske wrote:
Hello,

In the past I have successfully cloned and built the query service:
sudo git clone https://gerrit.wikimedia.org/r/wikidata/query/rdf ${WORKING_FOLDER}/wikidata-query-rdf
mvn package
But recently I get this error:
[ERROR] Failed to execute goal on project blazegraph-service: Could not resolve dependencies for project org.wikidata.query.rdf:blazegraph-service:war:0.2.4-SNAPSHOT: Could not transfer artifact org.linkeddatafragments:ldfserver:war:0.1.1-wmf2 from/to wmf.mirrored (http://archiva.wikimedia.org/repository/mirrored): peer not authenticated -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionExcepti...
I wonder if anyone has the same problem?
Hi!
But recently I get this error:
[ERROR] Failed to execute goal on project blazegraph-service: Could not resolve dependencies for project org.wikidata.query.rdf:blazegraph-service:war:0.2.4-SNAPSHOT: Could not transfer artifact org.linkeddatafragments:ldfserver:war:0.1.1-wmf2 from/to wmf.mirrored (http://archiva.wikimedia.org/repository/mirrored): peer not authenticated -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionExcepti...
I wonder if anyone has the same problem?
Yes, I had the same. It's because Java did not have the root certificate that Let's Encrypt is using, and WMF Archiva is now served with such a certificate. I think a Java upgrade should solve it, but if not, you need to manually add their certificate (it should be available on the site) using keytool. There are a number of guides on how to do this on Stack Overflow and elsewhere, though I don't have the link right now (I did it a week or so ago). If you don't find it, ping me and I'll try to dig it up.
Updating Java was sufficient in my case.
thanks -J
On Feb 1, 2017, at 12:53 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Hi all,
A new reconciliation service is waiting for its beta-testers: https://tools.wmflabs.org/openrefine-wikidata/
It has a bunch of new features:
- matching OpenRefine columns with Wikidata properties;
- auto-complete on types, items and properties;
- item previews, with pictures if the items provide them.
The scoring method is not great, though. I would be very interested in any pointers to the relevant literature (combining fuzzy matching scores of multiple columns).
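In the absence of pointers to the literature, one simple baseline for combining per-column fuzzy scores is a weighted average of string similarities. This is only an illustrative sketch, not what the service actually implements; the column names, weights, and the use of difflib's ratio are all assumptions for the example.

```python
from difflib import SequenceMatcher

def column_score(a: str, b: str) -> float:
    """Fuzzy similarity between two cell values, scaled to [0, 100]."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

def combined_score(row: dict, candidate: dict, weights: dict) -> float:
    """Weighted average of per-column fuzzy scores.

    `row` and `candidate` map column names to string values;
    `weights` maps column names to non-negative weights.
    """
    total = sum(weights.values())
    return sum(w * column_score(row[c], candidate.get(c, ""))
               for c, w in weights.items()) / total

# Hypothetical example: the name column counts twice as much as the city.
row = {"name": "Kunstenpunt", "city": "Brussel"}
candidate = {"name": "Kunstenpunt (Flanders Arts Institute)", "city": "Brussels"}
score = combined_score(row, candidate, {"name": 2.0, "city": 1.0})
```

A weighted average is easy to tune but treats columns as independent; more principled approaches (e.g. learning the weights from confirmed matches) are exactly the kind of literature pointers asked for above.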
Cheers, Antonin
On 26/01/2017 12:22, Alina Saenko wrote:
Great work so far, Antonin!
On Sun, Feb 5, 2017 at 5:18 PM Antonin Delpeuch (lists) < lists@antonin.delpeuch.eu> wrote: