Hoi,
Yes, you can add an item for the missing brother. When you do, you should
link it to his brother's item so that the two are explicitly marked as not
the same. They can both have the same alias. It helps to add pertinent data
such as dates of birth and death. I take it they are not twins.
Thanks,
GerardM
On 21 November 2015 at 18:34, Dario Taraborelli <dtaraborelli(a)wikimedia.org>
wrote:
I finally found the time to play extensively with Mix’n’match and it’s by
far one of the most promising models I’ve come across for Wikidata growth.
A short conversation with Magnus on Twitter got me thinking on how to best
preserve the output of costly human curation.[1]
I spent most of my time manually auditing automatically matched entries
from the Dizionario Biografico degli Italiani [2]. These are long,
unstructured biographical entries, and it takes quite a lot of effort to
determine whether the two individuals referenced by Wikidata and DBI actually
are the same person. This is a great example of a task that’s still pretty
hard for a machine to perform, no matter how sophisticated the algorithm.
My favorite example? Mix’n’match suggested a match between *Giulio
Baldigara* (Q1010811 <https://www.wikidata.org/wiki/Q1010811>) and *Giulio
Baldigara* (DBI
<http://www.treccani.it/enciclopedia/giulio-baldigara_(Dizionario_Biografico)/>)
which looked totally legitimate: these two individuals are both Italian
architects from the 16th century with the same name, they were both born
around the same years in the same city, and they were both active in Hungary
at the same time. A strong indication that they are the same person, right? It
turns out they are brothers, and the full name of the person referenced in
Wikidata is *Giulio Cesare Baldigara* (the least known in a family of
architects). I unmatched the suggestion and flagged the DBI entry as not
existing in Wikidata.
My question at the moment is this: the output of a labor-intensive review of a
potential match is currently stored as a volatile flag in a tool hosted on
Labs, but is invisible in Wikidata. Should something happen to Mix’n’match
(god forbid), the result of my work would be lost. Which got me thinking:
- shouldn’t a manually unmatched item be created directly on Wikidata?
(After all, DBI is all about notable individuals who would easily pass
Wikidata’s notability threshold for biographies.)
- shouldn’t the relation between *Giulio (Cesare) Baldigara* (Q1010811
<https://www.wikidata.org/wiki/Q1010811>) and the newly created item for *Giulio
Baldigara* be explicitly represented via a *not the same as* property, to
prevent future humans or machines from accidentally remerging the two items
based on some kind of heuristic?
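To make the second point concrete, here is a minimal sketch of how such a
relation could guard against an accidental remerge. It is only an in-memory
illustration: the Item class, the NOT_SAME_AS key, the helper functions and
the NEW_ITEM identifier are all hypothetical names made up for this example,
and none of them come from Mix’n’match or from the Wikidata API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical key standing in for the proposed "not the same as" property.
NOT_SAME_AS = "not_the_same_as"


@dataclass
class Item:
    """Toy stand-in for a Wikidata item: an ID, a label and some claims."""
    item_id: str
    label: str
    claims: Dict[str, Set[str]] = field(default_factory=dict)


def mark_distinct(a: Item, b: Item) -> None:
    """Record, on both items, that a human reviewer judged them distinct."""
    a.claims.setdefault(NOT_SAME_AS, set()).add(b.item_id)
    b.claims.setdefault(NOT_SAME_AS, set()).add(a.item_id)


def may_merge(a: Item, b: Item) -> bool:
    """Return False if either item carries a 'not the same as' link to the other."""
    a_blocks = b.item_id in a.claims.get(NOT_SAME_AS, set())
    b_blocks = a.item_id in b.claims.get(NOT_SAME_AS, set())
    return not (a_blocks or b_blocks)


# The item already in Wikidata, and a placeholder for the brother's new item.
cesare = Item("Q1010811", "Giulio Cesare Baldigara")
brother = Item("NEW_ITEM", "Giulio Baldigara")  # placeholder ID, not a real item

merge_allowed_before = may_merge(cesare, brother)  # nothing recorded yet
mark_distinct(cesare, brother)                     # store the curation result
merge_allowed_after = may_merge(cesare, brother)   # now blocked

print(merge_allowed_before, merge_allowed_after)
```

The point of the sketch is that the reviewer's decision lives on the items
themselves, so any later tool or heuristic can check it before proposing a
merge, regardless of what happens to the tool that produced it.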
Thoughts welcome,
Dario
[1] https://twitter.com/ReaderMeter/status/667214565621432320
[2] https://tools.wmflabs.org/mix-n-match/?mode=catalog&catalog=55&offs…
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata