Hi,

The discussion regarding properties in Wikidata is very interesting. However, I doubt that leaving the country (P17) property as it is would be a good idea.
The difference between "Dublin, country: Ireland" and "U2, country: Ireland" is not superficial: the two cases involve two different semantic relations.
If we look at the ACE relation types [1], we might conclude that the first case is a part-whole relation and the second a general-affiliation relation. The ACE classification scheme is very general, and it has been used successfully for validating many NLP systems and tasks, relation extraction in particular.

The reason I am against conflating these relations is the same as the reason I am against leaving concepts like "work as an activity" and "work as an artifact" undisambiguated. The activity/artifact distinction is semantically so basic that making it feels very natural. Moreover, we can draw conclusions from it: for example, an activity can be interrupted while an artifact cannot, and an artifact can be destroyed while an activity cannot. So it is also hard for me to accept such ambiguity in the case of properties.
I think this will become more apparent with the adoption of subproperties, since it will be hard to identify a superproperty for the country property, and even more apparent with systems that draw conclusions based on Wikidata. The semantics of properties may be broad, but they should not be ambiguous.
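
To illustrate the concern about systems that draw conclusions, here is a minimal sketch. It is only an assumption-laden toy: the property names `located_in_country` and `affiliated_with_country` are hypothetical (they are not existing Wikidata properties), and the `inside` helper is made up for this example. It shows how a naive "is geographically inside" inference behaves with one ambiguous property versus two distinct ones.

```python
# Hypothetical property names, used only for illustration;
# they are NOT real Wikidata properties.
PART_WHOLE = "located_in_country"        # part-whole relation
AFFILIATION = "affiliated_with_country"  # general-affiliation relation

# The same two facts, once with a single ambiguous property
# and once with the two relations kept apart.
claims_ambiguous = [
    ("Dublin", "country", "Ireland"),
    ("U2", "country", "Ireland"),
]
claims_split = [
    ("Dublin", PART_WHOLE, "Ireland"),
    ("U2", AFFILIATION, "Ireland"),
]


def inside(claims, whole, part_whole_props):
    """Return subjects inferred to be geographically inside `whole`.

    A subject qualifies if it is linked to `whole` through any
    property that the caller treats as a part-whole relation.
    """
    return sorted(
        subject
        for subject, prop, obj in claims
        if obj == whole and prop in part_whole_props
    )


# Treating the ambiguous property as part-whole also pulls in the band.
ambiguous_result = inside(claims_ambiguous, "Ireland", {"country"})

# With distinct properties, only the city is inferred to be inside Ireland.
split_result = inside(claims_split, "Ireland", {PART_WHOLE})

print(ambiguous_result)
print(split_result)
```

With the ambiguous property the band ends up among the things "inside" Ireland, while the split version returns only the city. The same kind of problem would arise when trying to pick a single superproperty for the country property.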

Kind regards,
Aleksander Smywinski-Pohl


[1] http://www.itl.nist.gov/iad/mig/tests/ace/2008/doc/ace08-evalplan.v1.2d.pdf



---- On Thu, 08 Jan 2015 22:29:33 +0100 Markus Krötzsch <markus@semantic-mediawiki.org> wrote ----
On 08.01.2015 21:29, Thad Guidry wrote:
> Hi Marcus!
>
> Yes, you and I are on the same page.

I do indeed get this impression ;-)

> Yes, I know about the
> Property-first view of Wikidata. No quibbles. But there is still an
> issue with Assumptions for "Country" P17 being used for an instance of
> Band...so let's clarify this...
>
> So, I guess things are fuzzy, because I do not jump to assumptions
> beyond the meaning of P17 as it states on its Property or Discussion
> page. And the difference is that you are mentally performing a few
> assumptions. It is hard to train a computer to read your mind and
> answer a question for you however. :) It's better to write those
> assumptions (meanings) down somewhere.
>
> Even for Freebase, we found that Properties had to be described very
> well for all to understand their meaning with minimal ambiguity, and I
> was a big proponent on Freebase to Google for better descriptions.
>
> So, question.... since you understand the meaning of "Country" on the U2
> page as you state... Can you please tell me that meaning...and then
> let's see if we can transfer your knowledge to better improve P17
> Property and its description.

For me this means "the band is associated with Ireland" (culturally,
personally, originally). No deeper meaning. In most cases, it will also
be true that the band has had its first public performance in this
country and that (all) the band members are of that nationality, but
this would be jumping to conclusions.

I fully agree with you that this is not very precise. My point is that
it is still useful to have such broad relationships recorded, for
example for browsing/filtering of data. Moreover, in this particular
case I have doubts that narrower properties such as "had first public
performance in" or "consists of band members that are nationals of"
would be what people are looking for. Your original proposal of "country
of origin" seems vague to me as well (what does this mean for a band?
when does a band "originate"? do they always have an official time and
place of being founded which is recorded somewhere? -- seems like we are
just simulating a level of precision that does not exist in "reality").

An "Irish band" is not a mathematical term with an exact definition, but
I can live with that: as long as we can find a reference that calls U2
an "Irish band", it is valid for me as data.

I completely agree with you that detailed descriptions are very helpful
-- you already need this to agree on what a "reference" for a claim is
(and what isn't). Yet, this is different from narrowing down the use of
a property to one special case. One can also allow for properties that
are intentionally broad and approximate as long as this is documented
clearly enough. In many cases, data with higher precision can be found
anyway (for example, Wikidata should record the nationality of band
members, and we could also capture the time and place of the first
public appearance and the nationality of the record label of the first
album etc.).

Best regards,

Markus

>
>
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>

