I finally found the time to play extensively with Mix’n’match and it’s by far one of the most promising models I’ve come across for Wikidata growth. A short conversation with Magnus on Twitter got me thinking on how to best preserve the output of costly human curation.[1]
I spent most of my time manually auditing automatically matched entries from the Dizionario Biografico degli Italiani [2]. These entries are long, unstructured biographical entries and it takes quite a lot of effort to understand if the two individuals referenced by Wikidata and DBI actually are the same person. This is a great example of a task that’s still pretty hard for a machine to perform, no matter how sophisticated the algorithm.
My favorite example? Mix’n’match suggested a match between Giulio Baldigara (Q1010811 https://www.wikidata.org/wiki/Q1010811) and Giulio Baldigara (DBI http://www.treccani.it/enciclopedia/giulio-baldigara_(Dizionario_Biografico)/) which looked totally legitimate: both are 16th-century Italian architects with the same name, born around the same years in the same city, and active in Hungary at the same time. A strong indication that they are the same person, right? It turns out they are brothers, and the full name of the person referenced in Wikidata is Giulio Cesare Baldigara (the least known in a family of architects). I unmatched the suggestion and flagged the DBI entry as not existing in Wikidata.
My question at the moment is: the output of a labor-intensive review of a potential match is currently stored as a volatile flag in a tool hosted on labs, but is invisible in Wikidata. Should something happen to Mix’n’match (god forbid) the result of my work would get lost. Which got me thinking:
- shouldn’t a manually unmatched item be created directly on Wikidata (after all, DBI is all about notable individuals who would easily pass Wikidata’s notability threshold for biographies)?
- shouldn’t the relation between Giulio (Cesare) Baldigara (Q1010811 https://www.wikidata.org/wiki/Q1010811) and the newly created item for Giulio Baldigara be explicitly represented via a “not the same as” property, to prevent future humans or machines from accidentally remerging the two items based on some kind of heuristics?
Thoughts welcome,
Dario
[1] https://twitter.com/ReaderMeter/status/667214565621432320
[2] https://tools.wmflabs.org/mix-n-match/?mode=catalog&catalog=55&offset=0&show_noq=0&show_autoq=1&show_userq=0&show_na=0
Hoi, Yes you can add an item for the missing brother. When you do, you should link it to his brother and thereby they are explicitly not the same. They can both have the same alias. It helps when you add pertinent data like a date of birth/death. I take it they are not twins. Thanks, GerardM
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Hi Gerard, I am actually interested in the general problem, not this specific pair: should Mix’n’match automatically perform the two actions I listed above? In other words, how can we clearly signal *in Wikidata* that the output of costly human labor should not be undone by machines or lazy humans in the future?
Dario Taraborelli
Head of Research, Wikimedia Foundation
wikimediafoundation.org http://wikimediafoundation.org/ • nitens.org http://nitens.org/ • @readermeter http://twitter.com/readermeter
Hoi, What you are talking about is a workflow that is much more involved than anything we currently automate.
The first thing is that you define a project. You assume that everyone mentioned in the Dizionario Biografico degli Italiani is notable enough to have a Wikidata item. The second thing you do is mix and match the people in this book against the people in Wikidata. For the ones you cannot match you want to create new items. The third thing you do is add statements for all the people that do not yet exist in Wikidata. As a consequence you will add the link between the two brothers. You will make sure that both are known as architects.
The creation of new items is done on the basis of those people in the book that do not have a Wikidata item yet. You may find after some time that you missed people who did have a Wikidata item after all; those items are then merged. Ideally there is a tool that allows easy addition of sources to statements that can be sourced to the book.
In general, much of this can be done already. Much of this will need to be done by hand. Much of this needs more documentation if it is to be a workflow that can be carried out by more than just a few. Thanks, GerardM
Hi Dario,
On 21-11-2015 at 18:34, Dario Taraborelli wrote:
> - shouldn’t a manually unmatched item be created directly on Wikidata (after all DBI is all about notable individuals who would easily pass Wikidata’s notability threshold for biographies)
If the person in question is notable, you should create an item.
> - shouldn’t the relation between Giulio (Cesare) Baldigara (Q1010811 https://www.wikidata.org/wiki/Q1010811) and the newly created item for Giulio Baldigara be explicitly represented via a “not the same as” property, to prevent future humans or machines from accidentally remerging the two items based on some kind of heuristics
You can use P1889: "different from" (https://www.wikidata.org/wiki/Property:P1889)
Maarten
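For anyone who wants to record such a pair in bulk rather than by hand, here is a minimal sketch. It assumes the tab-separated QuickStatements V1 line format (item, property, value) and emits the statement in both directions. The helper name and the placeholder ID "Q0" for the not-yet-created brother are hypothetical; only Q1010811 and P1889 come from this thread.

```python
from typing import List


def different_from_commands(qid_a: str, qid_b: str) -> List[str]:
    """Return tab-separated lines stating P1889 ("different from") both ways."""
    for qid in (qid_a, qid_b):
        # Very light sanity check: item IDs look like "Q" followed by digits.
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not an item ID: {qid!r}")
    if qid_a == qid_b:
        raise ValueError("an item cannot be different from itself")
    return [
        f"{qid_a}\tP1889\t{qid_b}",
        f"{qid_b}\tP1889\t{qid_a}",
    ]


if __name__ == "__main__":
    # "Q0" is a placeholder for the new item, not a real Wikidata ID.
    for line in different_from_commands("Q1010811", "Q0"):
        print(line)
```

The lines would still need to be reviewed and submitted by a human; nothing here talks to Wikidata.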
To address the first point: the auto-matches are just simple label matches. Removing the automatch in mix'n'match just says that this was not the same person etc., and the entry is moved back to the "unmatched" pool.
This does /not/ mean there isn't a match on Wikidata! You only say that by setting the entry to "not on Wikidata". And I do occasionally batch-create items for those, usually when all entries are processed. Which can have other issues, like an item was created in the meantime, and now I create a duplicate.
A solution could be to change the "not on Wikidata" button (or link) to a "create new item" button. The new item would have a label, a description (maybe), a statement with the catalog ID (if there is an associated Wikidata property!), and "instance of:human" if the entry is internally marked as "person", but nothing else.
Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it would make sense, for catalogs with a Wikidata property at least.
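To make the proposed stub concrete, here is a rough sketch of the data such a button might send, written in the Wikibase JSON shape used for entity edits. This is an illustration under assumptions, not Mix'n'match's actual code: the function name, the "P_CATALOG_ID" placeholder property, the description text and the use of the DBI URL slug as the catalog ID are all hypothetical. Only "instance of" (P31) and "human" (Q5) are standard Wikidata identifiers.

```python
import json
from typing import Optional


def build_stub_item(label: str,
                    catalog_property: Optional[str],
                    catalog_id: str,
                    description: Optional[str] = None,
                    is_person: bool = False,
                    language: str = "en") -> Optional[dict]:
    """Build the minimal item described above, or None if no item should be made."""
    if catalog_property is None:
        # Catalog has no Wikidata property: keep the plain "not on Wikidata" flag.
        return None
    data = {
        "labels": {language: {"language": language, "value": label}},
        "claims": [],
    }
    if description:
        data["descriptions"] = {
            language: {"language": language, "value": description}
        }
    # Statement carrying the catalog identifier (string-valued).
    data["claims"].append({
        "mainsnak": {
            "snaktype": "value",
            "property": catalog_property,
            "datavalue": {"value": catalog_id, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    })
    if is_person:
        # instance of (P31): human (Q5)
        data["claims"].append({
            "mainsnak": {
                "snaktype": "value",
                "property": "P31",
                "datavalue": {
                    "value": {"entity-type": "item", "numeric-id": 5},
                    "type": "wikibase-entityid",
                },
            },
            "type": "statement",
            "rank": "normal",
        })
    return data


if __name__ == "__main__":
    stub = build_stub_item(
        label="Giulio Baldigara",
        catalog_property="P_CATALOG_ID",  # placeholder, not a real property ID
        catalog_id="giulio-baldigara_(Dizionario_Biografico)",  # assumed ID form
        description="16th-century Italian architect",
        is_person=True,
    )
    print(json.dumps(stub, indent=2))
```

Nothing beyond label, optional description, catalog ID and "instance of" is added, matching the "but nothing else" restriction.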
As for the second point, I think in most cases the mere existence of a new, better-fitting item (or at least one equally fitting at first glance) will prevent false assignments. Sure, there are some cases, like the one given as an example, which would profit from a P1889 "different from" statement. We have run into that problem with the "merge game" I'm running, where people do a lot of false merges because the items seem identical at first glance.
However, I don't think this is prevalent enough to warrant special treatment in mix'n'match itself. For the few cases where it would help, Wikidata can always be edited manually. Besides, where would we draw the line? "John Smith" returns hundreds of search results; that would translate into tens of thousands of "different from" statements.
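The "tens of thousands" figure follows from counting pairs: n same-named items give n(n-1)/2 distinct pairs. A quick check, where 200 and 300 are illustrative counts standing in for "hundreds" and are not measured values:

```python
from math import comb


def different_from_pairs(n_items: int) -> int:
    """Number of distinct item pairs among n_items same-named items."""
    return comb(n_items, 2)


if __name__ == "__main__":
    for n in (200, 300):
        pairs = different_from_pairs(n)
        # Stating the relation in both directions would double the count.
        print(n, "items:", pairs, "pairs,", 2 * pairs, "statements if symmetric")
```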
I think once your "Giulio Baldigara" example brother is created, and both will show up in search results, that alone will be enough to serve as a "different from" warning in most settings. Mix'n'match automatch, for example, will only match entries where the exact label is unique in labels and aliases; two items with a "Giulio Baldigara" label or alias would not automatch an entry with that name.
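The automatch rule described above can be sketched as follows. This is a simplified reading of the description in this message (exact, case-sensitive string equality against labels and aliases), not the tool's real implementation. The Item class and the "Q0" ID for the new brother item are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    qid: str
    label: str
    aliases: List[str] = field(default_factory=list)


def automatch(entry_name: str, items: List[Item]) -> Optional[str]:
    """Return a QID only if exactly one item carries entry_name as label or alias."""
    candidates = [
        item.qid
        for item in items
        if entry_name == item.label or entry_name in item.aliases
    ]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    # Before: a single item carries the name, so it is auto-matched.
    before = [Item("Q1010811", "Giulio Baldigara")]
    print(automatch("Giulio Baldigara", before))

    # After: the brother has his own item ("Q0" is a placeholder ID) and the
    # existing item keeps the short name as an alias, so no automatch happens.
    after = [
        Item("Q1010811", "Giulio Cesare Baldigara", ["Giulio Baldigara"]),
        Item("Q0", "Giulio Baldigara"),
    ]
    print(automatch("Giulio Baldigara", after))
```

Under this reading, creating the second item is itself what blocks the false automatch, without any "different from" statement.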
On Nov 21, 2015, at 10:31, Magnus Manske magnusmanske@googlemail.com wrote:
> To address the first point: the auto-matches are just simple label matches. Removing the automatch in mix'n'match just says that this was not the same person etc., and the entry is moved back to the "unmatched" pool.
> This does /not/ mean there isn't a match on Wikidata! You only say that by setting the entry to "not on Wikidata".
Apologies, I was indeed referring to items explicitly flagged as "not on WD", not simply unmerged ones.
> And I do occasionally batch-create items for those, usually when all entries are processed. Which can have other issues, like an item was created in the meantime, and now I create a duplicate.
+1
> A solution could be to change the "not on Wikidata" button (or link) to a "create new item" button. The new item would have a label, a description (maybe), a statement with the catalog ID (if there is an associated Wikidata property!), and "instance of:human" if the entry is internally marked as "person", but nothing else.
> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it would make sense, for catalogs with a Wikidata property at least.
I would strongly support this, with the restrictions you suggest.
> As for the second point, I think in most cases the mere existence of a new, better-fitting item (or at least one equally fitting at first glance) will prevent false assignments. Sure, there are some cases, like the one given as an example, which would profit from a P1889 "different from" statement. We have run into that problem with the "merge game" I'm running, where people do a lot of false merges because the items seem identical at first glance.
> However, I don't think this is prevalent enough to warrant special treatment in mix'n'match itself. For the few cases where it would help, Wikidata can always be edited manually. Besides, where would we draw the line? "John Smith" returns hundreds of search results; that would translate into tens of thousands of "different from" statements.
> I think once your "Giulio Baldigara" example brother is created, and both will show up in search results, that alone will be enough to serve as a "different from" warning in most settings. Mix'n'match automatch, for example, will only match entries where the exact label is unique in labels and aliases; two items with a "Giulio Baldigara" label or alias would not automatch an entry with that name.
These are valid concerns, happy to withdraw the second part of the proposal. Thanks Maarten for pointing me to the right property.
On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
> On Nov 21, 2015, at 10:31, Magnus Manske magnusmanske@googlemail.com wrote:
>> A solution could be to change the "not on Wikidata" button (or link) to a "create new item" button. The new item would have a label, a description (maybe), a statement with the catalog ID (if there is an associated Wikidata property!), and "instance of:human" if the entry is internally marked as "person", but nothing else.
>> Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it would make sense, for catalogs with a Wikidata property at least.
> I would strongly support this, with the restrictions you suggest.
+1. This would be good.
A.
Done.
Magnus, is the change live yet? I unmatched Giuseppe Civran (Q3770329 https://www.wikidata.org/wiki/Q3770329) and Giuseppe Civran (DBI http://www.treccani.it/enciclopedia/giuseppe-civran_(Dizionario_Biografico)/) and flagged the latter as “Not on Wikidata”, but no new item was created.
I am starting from this view of Mix’n’match: https://tools.wmflabs.org/mix-n-match/?mode=catalog&catalog=55&offset=0&show_noq=0&show_autoq=1&show_userq=0&show_na=0#the_start
Hi Dario,
Looks like it's working now. Try clicking "remove match" on the "not in Wikidata" DBI item - it should now offer a link to create a new item rather than mark it as not-on-Wikidata.
I've just done it with one of the Dictionary of Ulster Biography items, and it produces this stub ready for fleshing out -
https://www.wikidata.org/w/index.php?title=Q21540032&oldid=276086447
Andrew.
It should show "New item" instead of "Not on Wikidata" where it works. You may have to force-reload mix'n'match once to get the new version.
Note that it will still say "Not on Wikidata", without item creation, for catalogs that do not have a Wikidata property.
On Tue, Nov 24, 2015 at 12:47 PM Andrew Gray andrew.gray@dunelm.org.uk wrote:
Hi Dario,
Looks like it's working now. Try clicking "remove match" on the "not in Wikidata" DBI item - it should now offer a link to create a new item rather than mark it as not-on-Wikidata.
I've just done it with one of the Dictionary of Ulster Biography items, and it produces this stub ready for fleshing out -
https://www.wikidata.org/w/index.php?title=Q21540032&oldid=276086447
Andrew.
--
- Andrew Gray andrew.gray@dunelm.org.uk
<3
L.
Magnus, this is fantastic and works as expected, thanks a lot.
One last note regarding the use of different from (P1889 https://www.wikidata.org/wiki/Property:P1889). While I agree with you that it would be overkill to generate all these relations for common homonyms, it would be tremendously useful to add a two-way relation automatically for new items created by Mix’n’match with the above tweak, where a single other notable individual was previously missing from Wikidata (and when no matching label can be found). See for example Grasulfo (Q3775839 https://www.wikidata.org/wiki/Q3775839) <-> different from (P1889 https://www.wikidata.org/wiki/Property:P1889) <-> Grasulfo (Q21571734 https://www.wikidata.org/wiki/Q21571734). Having this property added would save me 2 extra edits and permanently store a disambiguation signal for future reference.
Thoughts?
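To make the two-way relation concrete, here is a minimal sketch of the pair of statements it amounts to, using the Grasulfo example above. The helper names are hypothetical and this is not something Mix'n'match is known to do; the tab-separated output assumes the simple QuickStatements-style "item, property, value" line format.

```python
DIFFERENT_FROM = "P1889"  # "different from" property cited in this thread


def different_from_statements(item_a: str, item_b: str) -> list:
    """Return both directions of the relation as (subject, property, object)."""
    if item_a == item_b:
        raise ValueError("an item cannot be marked as different from itself")
    return [
        (item_a, DIFFERENT_FROM, item_b),
        (item_b, DIFFERENT_FROM, item_a),
    ]


def to_tab_separated(statements: list) -> str:
    """Render statements one per line, fields separated by tabs."""
    return "\n".join("\t".join(statement) for statement in statements)


# The existing item and the newly created one from the example above.
pairs = different_from_statements("Q3775839", "Q21571734")
batch = to_tab_separated(pairs)
```

These are the two extra edits mentioned above: one statement on each item, so the signal survives whichever item a future human or bot happens to look at.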
Hoi, Why not use Reasonator? https://tools.wmflabs.org/reasonator/?find=Grasulfo Thanks, GerardM
Gerard – I think you’re missing my point. I’m not suggesting this as a display feature (which would be welcome and can always be generated by any tool querying Wikidata labels) but as a contribution stored to avoid future errors.
Hoi, It is highly likely that your Lombard duke already existed. So I think you got it wrong. Thanks, GerardM
err…point me to the correct item or fix it then? WP:BOLD
Hoi, I do not know how to, as there are two candidates. I do not have your book that helps pick the right one. <grin> I have added some statements so that disambiguation is even easier. Reasonator is a great tool :) Thanks, GerardM
Oh I see, what a mess those Grasulfs are: the family relationships are totally messed up. Off to clean them up.
Hoi, I have merged your Grasulf with the one that existed. I did it based on the documentation you provided a source for :) Thanks, GerardM
On 27 November 2015 at 19:41, Dario Taraborelli dtaraborelli@wikimedia.org wrote:
oh I see, what a mess those Grisulfs, the family relationships are totally messed up, off to clean them up.
On Nov 27, 2015, at 10:38 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I do not know how to as there are two candidates. I do not have your book that helps pick the right one. <grin> I have added some statements so that disambiguation is even easier. Reasonator is a great tool :) Thanks, GerardM
On 27 November 2015 at 19:35, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
err…point me to the correct item or fix it then? WP:BOLD
On Nov 27, 2015, at 10:33 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, It is highly likely that your Lombard duke already existed. So I think you got it wrong. Thanks, GerardM
On 27 November 2015 at 19:31, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
Gerard – I think you’re missing my point. I’m not suggesting this as a display feature (which would be welcome and can always be generated by any tool querying Wikidata labels) but as a contribution *stored* to avoid future errors.
On Nov 27, 2015, at 10:29 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, Why not use Reasonator? https://tools.wmflabs.org/reasonator/?find=Grasulfo Thanks, GerardM
On 27 November 2015 at 19:26, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
Magnus, this is fantastic and works as expected, thanks a lot.
One last note regarding the use of *different from* (P1889 https://www.wikidata.org/wiki/Property:P1889). I agree with you that it would be overkill to generate all these relations for common homonyms. But for new items created by Mix’n’match with the above tweak, where a single other notable individual was previously missing from Wikidata (and when no matching label can be found), it would be tremendously useful to automatically add a two-way relation. See for example *Grasulfo* (Q3775839 https://www.wikidata.org/wiki/Q3775839) <-> *different from* (P1889 https://www.wikidata.org/wiki/Property:P1889) <-> *Grasulfo* (Q21571734 https://www.wikidata.org/wiki/Q21571734). Having this property added automatically would save me 2 extra edits and permanently store a disambiguation signal for future reference.
Thoughts?
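To make the two-way relation concrete, here is a minimal sketch of the pair of *different from* (P1889) statements it would take, one on each item, expressed in Wikibase-style JSON. The helper names are hypothetical and nothing here talks to Wikidata; it only shows why the relation costs two edits when done by hand.

```python
import re

DIFFERENT_FROM = "P1889"  # "different from", as cited in the message above


def item_datavalue(qid: str) -> dict:
    """Datavalue pointing at another item; rejects malformed QIDs."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a valid item ID: {qid!r}")
    return {
        "value": {"entity-type": "item", "numeric-id": int(qid[1:])},
        "type": "wikibase-entityid",
    }


def different_from_pair(qid_a: str, qid_b: str) -> dict:
    """Return one statement per item, each pointing at the other item."""
    if qid_a == qid_b:
        raise ValueError("an item cannot be marked as different from itself")

    def claim(target: str) -> dict:
        return {
            "mainsnak": {
                "snaktype": "value",
                "property": DIFFERENT_FROM,
                "datavalue": item_datavalue(target),
            },
            "type": "statement",
            "rank": "normal",
        }

    # Keyed by the item that should receive the statement.
    return {qid_a: claim(qid_b), qid_b: claim(qid_a)}


if __name__ == "__main__":
    # The two Grasulfo items from the example above.
    for subject, stmt in different_from_pair("Q3775839", "Q21571734").items():
        target = stmt["mainsnak"]["datavalue"]["value"]["numeric-id"]
        print(f"{subject} different from Q{target}")
```

Generating both directions in one step is what would keep the disambiguation visible from either item.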
On Nov 24, 2015, at 9:54 AM, Luca Martinelli martinelliluca@gmail.com wrote:
<3
L.
On 23 Nov 2015 at 21:05, "Magnus Manske" magnusmanske@googlemail.com wrote:
Done.
On Mon, Nov 23, 2015 at 12:25 PM Asaf Bartov abartov@wikimedia.org wrote:
On Sat, Nov 21, 2015 at 10:45 AM, Dario Taraborelli < dtaraborelli@wikimedia.org> wrote:
On Nov 21, 2015, at 10:31, Magnus Manske <magnusmanske@googlemail.com> wrote:
A solution could be to change the "not on Wikidata" button (or link) to a "create new item" button. The new item would have a label, a description (maybe), a statement with the catalog ID (if there is an associated Wikidata property!), and "instance of: human" if the entry is internally marked as "person", but nothing else.
Would that be welcomed by "mix'n'matchers", and Wikidata people? I think it would make sense, for catalogs with a Wikidata property at least.
I would strongly support this, with the restrictions you suggest.
+1. This would be good.
A.
-- Asaf Bartov Wikimedia Foundation http://www.wikimediafoundation.org/
On Nov 21, 2015 18:35, "Dario Taraborelli" dtaraborelli@wikimedia.org wrote:
My favorite example? Mix’n’ match suggested a match between Giulio Baldigara (Q1010811) and Giulio Baldigara (DBI) which looked totally legitimate: these two individuals are both Italian architects from the 16th century with the same name, they were both born around the same years in the same city, they were both active in Hungary at the same time: strong indication that they are the same person, right? It turns out they are brothers and the full name of the person referenced in Wikidata is Giulio Cesare Baldigara (the least known in a family of architects). I unmatched the suggestion and flagged the DBI entry as non existing in Wikidata.
Hi dario, an interesting example. How did you determine these two are different persons?
Rupert
On Nov 21, 2015, at 10:44, rupert THURNER rupert.thurner@gmail.com wrote:
On Nov 21, 2015 18:35, "Dario Taraborelli" dtaraborelli@wikimedia.org wrote:
My favorite example? Mix’n’ match suggested a match between Giulio Baldigara (Q1010811) and Giulio Baldigara (DBI) which looked totally legitimate: these two individuals are both Italian architects from the 16th century with the same name, they were both born around the same years in the same city, they were both active in Hungary at the same time: strong indication that they are the same person, right? It turns out they are brothers and the full name of the person referenced in Wikidata is Giulio Cesare Baldigara (the least known in a family of architects). I unmatched the suggestion and flagged the DBI entry as non existing in Wikidata.
Hi dario, an interesting example. How did you determine these two are different persons?
Rupert
DBI separately references three brothers: Giulio, Giulio Cesare and Ottavio and the entry suggested by MixNMatch is about Giulio. The Wikidata item was created from the Hungarian article which clearly refers to Giulio Cesare, but the WD label was created as Giulio, which resulted in the false positive.
Dario Taraborelli, 21/11/2015 18:34:
I spent most of my time manually auditing automatically matched entries from the Dizionario Biografico degli Italiani [2].
Thank you! That's very useful. I did some thousands too. :)
My favorite example? Mix’n’ match suggested a match between /Giulio Baldigara/ (Q1010811 https://www.wikidata.org/wiki/Q1010811) and /Giulio Baldigara/ (DBI http://www.treccani.it/enciclopedia/giulio-baldigara_%28Dizionario_Biografico%29/) which looked totally legitimate: these two individuals are both Italian architects from the 16th century with the same name, they were both born around the same years in the same city, they were both active in Hungary at the same time: strong indication that they are the same person, right? It turns out they are brothers and the full name of the person referenced in Wikidata is /Giulio Cesare Baldigara/ (the least known in a family of architects). I unmatched the suggestion and flagged the DBI entry as non existing in Wikidata.
Yes, this happens every now and then with Europeans of that time, also with a father and son having the very same name and the very same field of activity or even publications. Creating an item is good, as long as you have at least one piece of distinguishing information. The standard practice (at least on it.wiki) in such very ambiguous cases is to add a disambiguation page or note, even if the target article doesn't exist yet. In https://it.wikipedia.org/wiki/Antonio_Montanari I went the extra mile and also added more information, including a source: you can go into any level of detail, unlike on Wikidata. I encourage you to create a disambiguation page at https://it.wikipedia.org/wiki/Giulio_Baldigara with all the information you told us here (policy reference: https://it.wikipedia.org/wiki/Aiuto:Disambiguazione#Link_rossi ).
Nemo