+1 to this.
While property re-use is desirable in general, we need to make some basic
distinctions. Geographic relations (a special kind of part-whole
relation) are particularly clear and specific, and it would be
worthwhile to distinguish them from vaguer notions of association
between objects.
This issue comes up much more now that we have "subproperty of": if used
with cities, "country" is a subproperty of "is in the administrative
territorial entity" (P131), but when used with bands, it is not. This is a
sure sign that things should be diversified here.
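To make the problem concrete, here is a minimal sketch in Python. The
toy statements, the type labels and both helper functions are
assumptions made up for illustration; this is not Wikidata's data model
or API. It only shows what a naive reasoner would derive if "country"
(P17) were declared a subproperty of P131 across the board.

```python
# Toy data, invented for illustration only (not real Wikidata output).
STATEMENTS = [
    ("Dublin", "P17", "Ireland"),
    ("U2", "P17", "Ireland"),
]
INSTANCE_OF = {"Dublin": "city", "U2": "band"}

# Hypothetical global declaration: country (P17) is a subproperty of
# "is in the administrative territorial entity" (P131).
SUBPROPERTY_OF = {"P17": "P131"}

# Types for which a P131 statement is meaningful in this toy example.
GEOGRAPHIC_TYPES = {"city"}


def infer_superproperty(statements, subproperty_of):
    """Derive (subject, superproperty, value) for every statement whose
    property has a declared superproperty."""
    return [
        (subject, subproperty_of[prop], value)
        for subject, prop, value in statements
        if prop in subproperty_of
    ]


def implausible(inferred, instance_of, geographic_types):
    """Return derived P131 statements whose subject is not geographic."""
    return [
        triple
        for triple in inferred
        if triple[1] == "P131"
        and instance_of.get(triple[0]) not in geographic_types
    ]


inferred = infer_superproperty(STATEMENTS, SUBPROPERTY_OF)
bad = implausible(inferred, INSTANCE_OF, GEOGRAPHIC_TYPES)
print(bad)
```

Under these assumptions the derived statement for Dublin is fine, while
the one for U2 is flagged: the same property yields a sound inference
for cities and an unsound one for bands.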
Cheers,
Markus
On 12.01.2015 19:18, apohllo(a)o2.pl wrote:
Hi,
the discussion regarding the properties in Wikidata is very interesting.
However, I doubt that leaving the country (P17) property as it is would
be a good idea.
The difference between Dublin country: Ireland and U2 country: Ireland
is not superficial: the two cases involve two different semantic
relations.
When we look at the ACE relation types [1], we might conclude that the
first case is a part-whole relation and the second a
general-affiliation relation. The ACE classification scheme is very
general and was used successfully for validating many NLP systems and
tasks, relation extraction in particular.
The reason I am against conflating these relations is the same as the
reason I am against leaving concepts like "work as an activity" and
"work as an artifact" undisambiguated. The activity/artifact distinction
is semantically so basic that it feels very natural to make it.
Moreover, we can draw conclusions based on that distinction, e.g. that
an activity can be interrupted while an artifact cannot, and that an
artifact can be destroyed while an activity cannot. So it is also hard
for me to accept such ambiguity in the case of properties.
I think this will become more apparent with the adoption of
subproperties, since it will be hard to identify a superproperty for the
country property and even more apparent with systems that draw
conclusions based on Wikidata. The semantics of properties might be
broad, but should not be ambiguous.
Kind regards,
Aleksander Smywinski-Pohl
[1] http://www.itl.nist.gov/iad/mig/tests/ace/2008/doc/ace08-evalplan.v1.2d.pdf
---- On Thu, 08 Jan 2015 22:29:33 +0100 *Markus
Krötzsch<markus(a)semantic-mediawiki.org>* wrote ----
On 08.01.2015 21:29, Thad Guidry wrote:
Hi Marcus!
Yes, you and I are on the same page.
I do indeed get this impression ;-)
Yes, I know about the Property-first view of Wikidata. No quibbles. But
there is still an issue with Assumptions for "Country" P17 being used
for an instance of Band...so let's clarify this...
So, I guess things are fuzzy, because I do not jump to assumptions
beyond the meaning of P17 as it states on its Property or Discussion
page. And the difference is that you are mentally performing a few
assumptions. It is hard to train a computer to read your mind and
answer a question for you, however. :) It's better to write those
assumptions (meanings) down somewhere.
Even for Freebase, we found that Properties had to be described very
well for all to understand their meaning with minimal ambiguity, and I
was a big proponent on Freebase to Google for better descriptions.
So, question.... since you understand the meaning of "Country" on the
U2 page as you state... Can you please tell me that meaning...and then
let's see if we can transfer your knowledge to better improve P17
Property and its description.
For me this means "the band is associated with Ireland" (culturally,
personally, originally). No deeper meaning. In most cases, it will also
be true that the band has had its first public performance in this
country and that (all) the band members are of that nationality, but
this would be jumping to conclusions.
I fully agree with you that this is not very precise. My point is that
it is still useful to have such broad relationships recorded, for
example for browsing/filtering of data. Moreover, in this particular
case I have doubts that narrower properties such as "had first public
performance in" or "consists of band members that are nationals of"
would be what people are looking for. Your original proposal of
"country of origin" seems vague to me as well (what does this mean for
a band? when does a band "originate"? do they always have an official
time and place of being founded which is recorded somewhere? -- seems
like we are just simulating a level of precision that does not exist in
"reality"). An "Irish band" is not a mathematical term with an exact
definition, but I can live with that: as long as we can find a
reference that calls U2 an "Irish band", it is valid for me as data.
I completely agree with you that detailed descriptions are very helpful
-- you already need this to agree on what a "reference" for a claim is
(and what isn't). Yet, this is different from narrowing down the use of
a property to one special case. One can also allow for properties that
are intentionally broad and approximate as long as this is documented
clearly enough. In many cases, data with higher precision can be found
anyway (for example, Wikidata should record the nationality of band
members, and we could also capture the time and place of the first
public appearance and the nationality of the record label of the first
album etc.).
Best regards,
Markus
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l