As someone relatively new to Wikidata, I need to ask for some help understanding the following paragraph from the forwarded email:
Please note that *instance of* (P31) and *subclass of* (P279) are not valid values for *subproperty of* (P1647) claims, as described in the P1647 documentation [1]. For example, claims like "occupation *subproperty of* instance of" are invalid.
What specifically in the P1647 documentation [1] describes that *instance of* (P31) and *subclass of* (P279) are not valid values for *subproperty of* (P1647) claims?
[1] https://www.wikidata.org/wiki/Property:P1647
Thanks, James Weaver
On Sat, Jan 10, 2015, at 02:25 PM, Emw wrote:
Since it appears that the creation of *subproperty of* went unnoticed by many, I'd like to describe an important aspect of its proper use, and how that relates to classification.
Please note that *instance of* (P31) and *subclass of* (P279) are not valid values for *subproperty of* (P1647) claims, as described in the P1647 documentation [1]. For example, claims like "occupation *subproperty of* instance of" are invalid. The reasons for this are both technical and architectural.
On the technical side, *instance of*, *subclass of* and *subproperty of* are intended to be straightforwardly exportable as rdf:type, rdfs:subClassOf and rdfs:subPropertyOf. As described in *On the Properties of Metamodeling in OWL* [2], claims that use OWL's built-in vocabulary (e.g. rdf:type) as individuals make an ontology undecidable. If an ontology is undecidable, then queries are not guaranteed to terminate. This is a big deal. Decidability is a main goal of OWL 2 DL and a requirement in the more specialized profiles OWL 2 EL, OWL 2 RL and OWL 2 QL. Most Semantic Web ontologies aim to be valid in at least OWL 2 DL. So if Wikidata aims to be easily interoperable with the rest of the Semantic Web, we should aim to be valid in OWL 2 DL, and thus not make claims of the form "P *subproperty of* instance of (P31)" or "P *subproperty of* subclass of (P279)".
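To make the rule concrete, here is a minimal sketch of a check that flags the invalid claims. It assumes claims are available as plain (subject, property, value) tuples of property IDs, which is a simplification and not Wikidata's actual data model; the function name and the P9998/P9999 IDs are hypothetical placeholders.

```python
# Property IDs as given in this email.
INSTANCE_OF = "P31"
SUBCLASS_OF = "P279"
SUBPROPERTY_OF = "P1647"
OCCUPATION = "P106"

# Values that must not appear as the object of a "subproperty of" claim.
FORBIDDEN_PARENTS = {INSTANCE_OF, SUBCLASS_OF}


def invalid_subproperty_claims(claims):
    """Return the (subject, property, value) claims that break the rule.

    Hypothetical helper: `claims` is assumed to be an iterable of
    3-tuples of ID strings.
    """
    return [
        (subject, prop, value)
        for (subject, prop, value) in claims
        if prop == SUBPROPERTY_OF and value in FORBIDDEN_PARENTS
    ]


claims = [
    # The invalid example above: occupation *subproperty of* instance of.
    (OCCUPATION, SUBPROPERTY_OF, INSTANCE_OF),
    # Hypothetical placeholder IDs, only here to show a claim that passes.
    ("P9999", SUBPROPERTY_OF, "P9998"),
]

for claim in invalid_subproperty_claims(claims):
    print("invalid:", claim)
```

The check looks only at the direct value of each claim; it does not follow chains of *subproperty of* claims.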
Avoiding such claims is also good design. There should be one -- and preferably only one -- obvious way to specify the type of an instance. Having a multitude of domain-specific "type" subproperties would promote an anti-pattern: using *instance of* as a catch-all property to make any statement under the sun that makes sense when connected with the phrase "is a".
Having a single "type" property for instances also fosters another best practice in Wikidata: asserted monohierarchy [3]. In other words, there should be only one explicit normal or preferred *instance of* or *subclass of* claim per item. Having an *instance of* claim and a *subclass of* claim on an item isn't necessarily bad (it's called "punning"), but having multiple *instance of* claims or multiple *subclass of* claims on an item is a bad smell. Items can typically satisfy a huge number of *instance of* claims, but should generally have only one such claim made explicitly in Wikidata.
For example, Coco Chanel (Q45661) can be said to be "*instance of* French person", "*instance of* fashion designer", "*instance of* female", etc. Instead of such catch-all use of *instance of*, Wikidata moves that knowledge into properties like *country of citizenship* (P27), *occupation* (P106) and *sex or gender* (P21). Coco Chanel has one explicit *instance of* value: human (Q5) -- a class that encapsulates essential features of the subject.
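The monohierarchy guideline can be sketched as a simple count per item and property. This is only an illustration under stated assumptions: claims are plain (item, property, value) tuples, ranks (normal, preferred, deprecated) are ignored, the values other than Q5 are written as labels and not real item IDs, and "Q_hypothetical" with its two classes is made up to show a violation.

```python
from collections import defaultdict

INSTANCE_OF = "P31"
SUBCLASS_OF = "P279"


def monohierarchy_violations(claims):
    """Return sorted (item, property) pairs with more than one explicit
    *instance of* or *subclass of* claim.

    One P31 claim plus one P279 claim on the same item ("punning") is
    not flagged, because the count is kept per property.
    """
    counts = defaultdict(int)
    for item, prop, _value in claims:
        if prop in (INSTANCE_OF, SUBCLASS_OF):
            counts[(item, prop)] += 1
    return sorted(key for key, n in counts.items() if n > 1)


claims = [
    # Coco Chanel (Q45661): one explicit class, the rest in other properties.
    ("Q45661", "P31", "Q5"),
    ("Q45661", "P27", "France"),             # label, not an item ID
    ("Q45661", "P106", "fashion designer"),  # label, not an item ID
    ("Q45661", "P21", "female"),             # label, not an item ID
    # Made-up item with two explicit *instance of* claims.
    ("Q_hypothetical", "P31", "class A"),
    ("Q_hypothetical", "P31", "class B"),
]

for item, prop in monohierarchy_violations(claims):
    print(item, "has multiple", prop, "claims")
```

A real check would also need to take claim ranks into account, since the guideline above concerns normal or preferred claims.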
Most of Wikidata follows these general principles of classification. But a few domains of knowledge remain either somewhat of a mess, or organized but idiosyncratic. Items like the one for the German municipality of Aalen [4], with 7 *instance of* values -- several of them redundant -- exemplify the mess. With the deletion of domain-specific "type" properties like *type of administrative territorial entity* (P132) [5], we are on the right track. The solution is not to make such things subproperties of *instance of*, but rather to delete them and use *instance of* for one preferred class and put other values in other properties (note -- this may require new properties!).
The same applies for *subclass of*.
I encourage anyone interested in stuff like *subproperty of* to join the discussions ongoing at https://www.wikidata.org/wiki/Wikidata:Property_proposal/Property_metadata. The Wikidata community is currently discussing how we want to handle things like *domain* and *range* properties (e.g. should we use rdfs:domain or schema:domainIncludes?) and whether we want to have an *inverse of* property (or delete all inverse properties). The outcome of these discussions will shape the interface between Wikidata and the rest of the Semantic Web.
Thanks, Eric
https://www.wikidata.org/wiki/User:Emw
[1] https://www.wikidata.org/wiki/Property:P1647
[2] Boris Motik (2007). On the Properties of Metamodeling in OWL. https://www.cs.ox.ac.uk/boris.motik/pubs/motik07metamodeling-journal.pdf
[3] Barry Smith, Werner Ceusters (2011). Ontological realism: A methodology for coordinated evolution of scientific ontologies. Section 1.8: Asserted monohierarchies. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3104413/#S9
[4] Aalen on Wikidata as of 2015-01-10. https://www.wikidata.org/w/index.php?title=Q3951&oldid=184247296#P31
[5] https://www.wikidata.org/wiki/Wikidata:Requests_for_deletions/Archive/2014/P...
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l