As someone relatively new to Wikidata, I need to ask for some help
understanding the following paragraph from the forwarded email:
Please note that *instance of* (P31) and *subclass of* (P279) are not
valid values for *subproperty of* (P1647) claims, as described in the
P1647 documentation [1]. For example, claims like "occupation
*subproperty of* instance of" are invalid.
What specifically in the P1647 documentation [1] describes that
*instance of* (P31) and *subclass of* (P279) are not valid values for
*subproperty of* (P1647) claims?
Since it appears that the creation of *subproperty of* went unnoticed
by many, I'd like to describe an important aspect of its proper use,
and how that relates to classification.
Please note that *instance of* (P31) and *subclass of* (P279) are not
valid values for *subproperty of* (P1647) claims, as described in the
P1647 documentation [1]. For example, claims like "occupation
*subproperty of* instance of" are invalid. The reasons for this are
both technical and architectural.
On the technical side, *instance of*, *subclass of* and *subproperty of*
are intended to be straightforwardly exportable as rdf:type,
rdfs:subClassOf and rdfs:subPropertyOf. As described in *On the
Properties of Metamodeling in OWL* [2], claims that use OWL's built-in
vocabulary (e.g. rdf:type) as individuals make an ontology
undecidable. If an ontology is undecidable, then queries are not
guaranteed to terminate. This is a big deal. Decidability is a main
goal of OWL 2 DL and a requirement in the more specialized profiles
OWL 2 EL, OWL 2 RL and OWL 2 QL. Most Semantic Web ontologies aim to
be valid in at least OWL 2 DL. So if Wikidata aims to be easily
interoperable with the rest of the Semantic Web, we should aim to be
valid in OWL 2 DL, and thus not make claims of the form "P
*subproperty of* instance of (P31)" or "P *subproperty of* subclass of
(P279)".
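
To make the rule concrete, here is a minimal sketch of a check for such claims. It assumes claims are available as plain (child, parent) pairs of property IDs; the function names and the placeholder IDs "P_example_a" to "P_example_c" are hypothetical and exist only for illustration.

```python
# How the three properties are meant to be exported, per the text above.
EXPORT_TERM = {
    "P31": "rdf:type",
    "P279": "rdfs:subClassOf",
    "P1647": "rdfs:subPropertyOf",
}

# Properties that must not be the value of a "subproperty of" claim.
FORBIDDEN_PARENTS = {"P31", "P279"}


def invalid_subproperty_claims(claims):
    """Return the (child, parent) pairs whose parent is P31 or P279.

    Each pair stands for "child subproperty of (P1647) parent".
    """
    return [(child, parent) for child, parent in claims
            if parent in FORBIDDEN_PARENTS]


def as_exported_triple(child, parent):
    """Render a claim the way it would look after export, to show the
    built-in term ending up in the object position."""
    obj = EXPORT_TERM.get(parent, parent)
    return f"{child} {EXPORT_TERM['P1647']} {obj}"


# The first pair is the invalid example from the text; the other two use
# placeholder IDs that are not real Wikidata properties.
sample = [
    ("P106", "P31"),
    ("P_example_a", "P279"),
    ("P_example_b", "P_example_c"),
]

for child, parent in invalid_subproperty_claims(sample):
    print("invalid:", as_exported_triple(child, parent))
```

The printed form of the first claim has rdf:type in the object position, which is the kind of metamodeling the paragraph above warns about.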
Avoiding such claims is also good design. There should be one -- and
preferably only one -- obvious way to specify the type of an instance.
Having a multitude of domain-specific "type" subproperties would
promote an anti-pattern: using *instance of* as a catch-all property
to make any statement under the sun that makes sense when connected
with the phrase "is a".
Having a single "type" property for instances also fosters another
best practice in Wikidata: asserted monohierarchy [3]. In other words,
there should be only one explicit normal or preferred *instance of* or
*subclass of* claim per item. Having an *instance of* claim and a
*subclass of* claim on an item isn't necessarily bad (it's called
"punning"), but having multiple *instance of* claims or multiple
*subclass of* claims on an item is a bad smell. Items can typically
satisfy a huge number of *instance of* claims, but should generally
have only one such claim made explicitly in Wikidata.
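
A rough sketch of how this "bad smell" could be detected, assuming statements are given as (item, property, value) triples. The item names and values are made up, and the sketch ignores statement ranks, so it is stricter than the "normal or preferred" wording above.

```python
from collections import Counter

# The two classifying properties discussed above.
CLASSIFYING = ("P31", "P279")


def monohierarchy_smells(statements):
    """Map (item, property) to its claim count wherever an item has more
    than one *instance of* or more than one *subclass of* claim.

    One P31 claim plus one P279 claim on the same item ("punning") is
    not reported, because the counts are kept per property.
    """
    counts = Counter(
        (item, prop) for item, prop, _value in statements
        if prop in CLASSIFYING
    )
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical data: item_a has three instance-of claims, item_b puns.
statements = [
    ("item_a", "P31", "class_1"),
    ("item_a", "P31", "class_2"),
    ("item_a", "P31", "class_3"),
    ("item_b", "P31", "class_1"),
    ("item_b", "P279", "class_4"),
]

print(monohierarchy_smells(statements))
```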
For example, Coco Chanel (Q45661) can be said to be "*instance of*
French person", "*instance of* fashion designer", "*instance of*
female", etc. Instead of such catch-all use of *instance of*, Wikidata
moves that knowledge into properties like *country of citizenship*
(P27), *occupation* (P106) and *sex or gender* (P21). Coco Chanel has
one explicit *instance of* value: human (Q5) -- a class that
encapsulates essential features of the subject.
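
The Coco Chanel example can be sketched as a small rewrite from catch-all *instance of* values to dedicated properties. The REWRITES table and the function name are assumptions made for illustration, values are shown as labels rather than item IDs, and "France" as the citizenship value is inferred from "French person".

```python
# Illustrative mapping from a catch-all "instance of" value to the
# dedicated property (and value) that should carry that knowledge.
REWRITES = {
    "French person": ("P27", "France"),               # country of citizenship
    "fashion designer": ("P106", "fashion designer"),  # occupation
    "female": ("P21", "female"),                       # sex or gender
}


def split_catch_all(instance_of_values, essential_class):
    """Keep one explicit instance-of claim and move the rest elsewhere.

    Returns (statements, unmapped): statements is a list of
    (property, value) pairs; unmapped lists the values that have no
    entry in REWRITES and still need a human decision.
    """
    statements = [("P31", essential_class)]
    unmapped = []
    for value in instance_of_values:
        if value in REWRITES:
            statements.append(REWRITES[value])
        else:
            unmapped.append(value)
    return statements, unmapped


coco_statements, coco_unmapped = split_catch_all(
    ["French person", "fashion designer", "female"], "human"
)
for prop, value in coco_statements:
    print(prop, value)
```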
Most of Wikidata follows these general principles of classification.
But a few domains of knowledge remain either somewhat of a mess, or
organized but idiosyncratic. Items like the one for the German
municipality of Aalen [4], with 7 *instance of* values -- several of
them redundant -- exemplify the mess. With the deletion of
domain-specific "type" properties like *type of administrative
territorial entity* (P132) [5], we are on the right track. The
solution is not to make such things subproperties of *instance of*,
but rather to delete them and use *instance of* for one preferred
class and put other values in other properties (note -- this may
require new properties!).
The same applies for *subclass of*.
I encourage anyone interested in stuff like *subproperty of* to join
the discussions ongoing at
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Property_metadata.
The Wikidata community is currently discussing how we want to handle
things like *domain* and *range* properties (e.g. should we use
rdfs:domain or schema:domainIncludes?) and whether we want to have an
*inverse of* property (or delete all inverse properties). The outcome
of these discussions will shape the interface between Wikidata and the
rest of the Semantic Web.
Thanks, Eric
https://www.wikidata.org/wiki/User:Emw
1. https://www.wikidata.org/wiki/Property:P1647
2. Boris Motik (2007). On the Properties of Metamodeling in OWL.
https://www.cs.ox.ac.uk/boris.motik/pubs/motik07metamodeling-journal.pdf
3. Barry Smith, Werner Ceusters (2011). Ontological realism: A
methodology for coordinated evolution of scientific ontologies.
Section 1.8: Asserted monohierarchies.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3104413/#S9
4. Aalen on Wikidata as of 2015-01-10.
https://www.wikidata.org/w/index.php?title=Q3951&oldid=184247296#P31
5. https://www.wikidata.org/wiki/Wikidata:Requests_for_deletions/Archive/201…
_________________________________________________
Wikidata-l mailing list Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l