Regarding definitions:

Note that I said "Label + Description is identifying", not merely the label. I assume this to be true because even for your example of "Germany", the disambiguation page works with rather short descriptions of each disambiguated page [1]. So even the fuzzy concept that you gave as an example seems to be sufficiently identifiable for the purposes and mission of the Wikipedia community, which gives me reason to believe that the community can sort this out. I mean, they basically already have!

Regarding the Kangoo / Kubistar example:

In Wikidata they would be represented as two pages, one for the Kubistar (which would link to the Danish and German page for the Kubistar), and one for the Kangoo (which would link to the 20 language versions of the Kangoo article, including a Danish and a German one). This is a rather simple example, which would be easily expressed with the exact matches that we suggest.

In Wikidata, the Wikipedia links are planned to be inverse functional - i.e., every Wikipedia article in a specific language can only be linked to from one single Wikidata article. Two Wikidata pages cannot claim the same Wikipedia article in a single language as their defining article.

I.e. in the Kubistar/Kangoo example there would be two Wikidata pages. One about the Kubistar, linking to de:Nissan_Kubistar and da:Nissan_Kubistar, and one about the Kangoo, linking to the 20 different Kangoo articles. The Wikidata page for Kubistar could not link to any of those Kangoo articles.
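
As a rough sketch of what "inverse functional" means here, the following Python snippet rejects a second Wikidata page claiming an article that is already taken. It is only an illustration under stated assumptions, not the actual Wikidata implementation: the names SitelinkIndex and LinkConflict, the item identifiers "Kubistar" and "Kangoo", and the article titles de:Renault_Kangoo and da:Renault_Kangoo are all made up for the example. Note that it only enforces one owner per article; it does not restrict how many articles a page may link to.

```python
class LinkConflict(Exception):
    """Raised when a Wikipedia article is already claimed by another page."""


class SitelinkIndex:
    """Hypothetical index mapping each Wikipedia article to its one Wikidata page."""

    def __init__(self):
        # (language, title) -> Wikidata page; a single owner per article
        # is what makes the link inverse functional.
        self._owner = {}

    def link(self, item, language, title):
        key = (language, title)
        owner = self._owner.get(key)
        if owner is not None and owner != item:
            raise LinkConflict(
                f"{language}:{title} is already linked from {owner}"
            )
        self._owner[key] = item

    def links_of(self, item):
        # All articles claimed by the given page, in sorted order.
        return sorted(
            f"{lang}:{title}"
            for (lang, title), owner in self._owner.items()
            if owner == item
        )


index = SitelinkIndex()
index.link("Kubistar", "de", "Nissan_Kubistar")
index.link("Kubistar", "da", "Nissan_Kubistar")
index.link("Kangoo", "de", "Renault_Kangoo")  # assumed article title
index.link("Kangoo", "da", "Renault_Kangoo")  # assumed article title

# The Kubistar page cannot also claim a Kangoo article.
try:
    index.link("Kubistar", "de", "Renault_Kangoo")
    conflict = None
except LinkConflict as err:
    conflict = str(err)
```

After running this, conflict holds the rejection message and the Kubistar page still links only to its own two articles.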

Please do not misunderstand, I am not categorically against non-exact matches or broader/narrower matches (or else I wouldn't be discussing). But I haven't yet seen examples that convince me that the additional complexity of broader/narrower or inexact matches is required. As I said before, if we can model more than 99% of all language links with the suggested simple solution, I am reluctant to make it more complicated for the remaining <1%.

Cheers,
Denny

P.S.: oh, yes, indeed! Thank you for this excellent and interesting discussion, it really does shed light on some of the aspects of the current draft of the data model, and will eventually improve it and sharpen the understanding of the model. 

[1] https://en.wikipedia.org/wiki/Germany_(disambiguation)



2012/4/5 Gregor Hagedorn <g.m.hagedorn@gmail.com>
On 5 April 2012 18:30, Denny Vrandečić <denny.vrandecic@wikimedia.de> wrote:
> The label and the description together are meant to be identifying.
>
> I.e. "Georgia - A country in central Asia", or "Frankfurt - A city in Hesse,
> Germany", etc.
>
> Additionally, the Wikipedia links provide quite some guidance to it.

I believe it will be difficult to craft labels that work as
definitions. A label is hinting, and may often be sufficiently precise
for the majority of purposes. If we speak of "Germany" it is very hard
to express in a simple string the different historical, geographical,
political delimitations that this term may carry.

In my own field of work even technical terms are often difficult to
resolve to a definition. In biology, the width of taxon delimitations
changes over time and with new research, and even technical terms in
morphology often have quite different meanings, depending on the
"school" that is being followed.

Or to cite a car example again: The label "Renault Kangoo" is
unspecific as to the version/revision/release of it, so technical data
that vary between these versions cannot be added to it. However, the
de.wikipedia.org/wiki/Nissan_Kubistar is in most Wikipedias also
subsumed under "Renault Kangoo". So it is a valid assumption that when
labeling something "Renault Kangoo" it refers to both of these
identical models sold under different names. But then, the "Nissan
Kubistar" is only equivalent to the first version/revision/release of
the "Renault Kangoo"...

This is not unsolvable, but if you want to import or add data to an
element, it will be very hard to judge from a short label the correct
concept. I was hoping that linking this to Wikipedia articles would
help, but this will be hard if a Wikidata page is linked to 40
Wikipedias, of which any given Wikidata editor can read only a
handful, with no support to distinguish between exactMatch and
closeMatch.

My suggestion is to allow a differentiation of exactMatch and
closeMatch and to instruct editors to use at least one exact match,
and to consider this or these the defining Wikipedia pages, whereas
others are added as close matches.

Of course, the label will remain useful for stumbling upon changes in
the definition or width of a concept over time, and for correcting
those after consulting the revision to which the original link was
formed (not present, but perhaps achievable by some timestamping and
comparison?)

Gregor




--
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.