On 10/18/2015 01:59 PM, Stas Malyshev wrote:
[Emw]
Hi!
The community-defined meaning of /subclass of/ (P279) is that of rdfs:subClassOf [1]. Similarly, the community-defined meaning of /instance of/ (P31) is that of rdf:type [2, 3].
Are you sure [that] is always correct? AFAIK there are some specific rules and meanings in OWL that classes should adhere to: for example, the same thing cannot be both an individual and a class, and there are others (I'm not completely sure of the whole list, as I don't have enough background in RDF/OWL). But I'm not sure existing data actually follows that.
OWL does not currently allow classes to be directly treated as individuals. This is more of an engineering decision than a philosophical one, however. In RDFS classes are also individuals.
There are some open problems with how to handle qualifiers on /instance of/ and /subclass of/ in RDF/OWL exports of P31 as rdf:type and P279 as rdfs:subClassOf, but that does not negate the community's decision to tie its two most basic membership properties to those W3C standard properties. In the current RDF/OWL exports that follow the community
I'm not sure I understand how that works in practice. I.e., if we say that P31 *is* rdf:type, then it can't be qualified in RDF/OWL and we can not represent part (albeit small, qualified properties are about 0.2% of all such properties) of our data.
I mean, we can certainly have data sets which include P31 statements from the data translated to rdf:type unless they have qualifiers, and that can be very useful pragmatically, no question about it. But can we really say P31 is the same as rdf:type and use it whenever we choose to represent Wikidata data as RDF? I'm not sure about that.
Nor am I.
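To make the pragmatic option above concrete (translate P31/P279 to rdf:type/rdfs:subClassOf only when a statement has no qualifiers), here is a minimal sketch. The statement layout, the "Q_FOOD" identifier and the "P_EXAMPLE" qualifier are assumptions made up for illustration; this is not the actual Wikidata export code or JSON format.

```python
# Hypothetical, simplified statement records; NOT the real Wikidata JSON layout.
DIRECT_MAPPING = {
    "P31": "rdf:type",           # instance of
    "P279": "rdfs:subClassOf",   # subclass of
}


def export_simple_triples(statements):
    """Split P31/P279 statements into plain triples and skipped statements.

    Unqualified statements become (subject, predicate, object) triples.
    Qualified statements are returned separately, because a plain
    rdf:type / rdfs:subClassOf triple has nowhere to carry qualifiers.
    """
    triples, skipped = [], []
    for st in statements:
        predicate = DIRECT_MAPPING.get(st["property"])
        if predicate is None:
            continue  # not one of the two membership properties
        if st.get("qualifiers"):
            skipped.append(st)
        else:
            triples.append((st["subject"], predicate, st["value"]))
    return triples, skipped


# Made-up example data: Q177 is pizza; "Q_FOOD" and "P_EXAMPLE" are placeholders.
statements = [
    {"subject": "Q177", "property": "P279", "value": "Q_FOOD", "qualifiers": {}},
    {"subject": "Q177", "property": "P31", "value": "Q_FOOD",
     "qualifiers": {"P_EXAMPLE": "some value"}},
]

triples, skipped = export_simple_triples(statements)
print(triples)
print(len(skipped), "qualified statement(s) left out")
```

The skipped list is exactly the part of the data that the simple mapping cannot represent, which is the gap being questioned here.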
For example, pizza (https://www.wikidata.org/wiki/Q177) is currently modeled as an instance of food and (transitively) a subclass of food.
Here we have another practical issue - if we adhere to the strict notion that pizza is only a subclass, then we would practically never have any instances in the database for wide categories of things. I.e. since a particular food item is rarely notable enough to be featured in Wikidata, no food would have instances. It may be formally correct but I'm afraid it's not like most people think - for most people, pizza is a food, not a "subclass of food".
Well, pizza is a kind of food, and a kind that is important enough to get a name in some languages. I agree that it would be nice, however, to be able to model the way that we think that people think, and thus be able to make pizza an instance of some food class instead of requiring that it be (only) a subclass of some general class.
Same with chemistry - as virtually no actual physical chemical compound (as in "this brown liquid in my test tube I prepared this morning by mixing contents of those three other test tubes") would be notable enough to gain entry in Wikidata, [nearly] nothing in chemistry would ever be an instance. Theoretically it may be sound, but practically I'm not sure it would work well, and I'm even less sure that it is *already* what the consensus on Wikidata is.
I have come around to the position that it is preferable to model these sorts of domains using multiple levels of the class hierarchy. For food, there would be a class (possibly called food) whose instances are those things that are actually eaten (like the pizza I ate in Bethlehem last week). There would also be a class (possibly also called food, but maybe food type) whose instances are the (notable?) classes of food (like pizza, but maybe also like bad pizza from a hole-in-the-wall restaurant). This lets you have your cake and describe it too.
I have also come around to the position that this situation is very common. Also, people seem to be generally capable of working with such modelling, at least informally in their heads.
However, this modelling methodology needs to be described to users, as even things that people do well internally can cause problems when they are being externalized. For example, it would be a problem if users put things in the wrong place (pizza as an instance of the non-food-type food) or make other modelling errors. There also should be tool support, for example to ensure that all instances of the food-type food are subclasses of the non-food-type food (and maybe vice versa).
But what else can be done? Every other approach that I have seen has what I consider to be worse problems.
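As a rough illustration of the kind of tool support mentioned above, the following sketch checks that every instance of the food-type class is a (transitive) subclass of the item-level food class. The class names, the dictionary layout and the sample data are all hypothetical; a real tool would read statements from Wikidata instead.

```python
def subclass_closure(subclass_of, start):
    """Return every class reachable from `start` via subclass-of edges."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def misplaced_food_types(instance_of, subclass_of,
                         type_class="food type", item_class="food"):
    """List instances of `type_class` that are not subclasses of `item_class`."""
    return sorted(
        item
        for item, classes in instance_of.items()
        if type_class in classes
        and item_class not in subclass_closure(subclass_of, item)
    )


# Made-up data: "cutlery" is deliberately mis-modelled as a food type.
instance_of = {
    "pizza": {"food type"},
    "flatbread dish": {"food type"},
    "cutlery": {"food type"},
}
subclass_of = {
    "pizza": {"flatbread dish"},
    "flatbread dish": {"food"},
    "cutlery": {"tool"},
}

print(misplaced_food_types(instance_of, subclass_of))  # ['cutlery']
```

The reverse check (every subclass of the item-level class is an instance of the food-type class) could be written the same way, if that direction is wanted at all.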
Stas Malyshev smalyshev@wikimedia.org
Peter F. Patel-Schneider