On 27.08.2015 14:43, Svavar Kjarrval wrote:
So far from the other thread, the current need seems to be for two types of definitions:
- How to interpret declarations depending on associated properties.
If I understand your explanations correctly, the first point is a very specific case of inference, which is already thinking in terms of "hierarchies" (of some property). I am asking: how do we even know that some properties are supposed to be read as forming a "hierarchy"? This is one special case of a rule of inference that one might formulate. Have a look at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases
for some more examples of what could be relevant inferences. As you can see, only a few of these cases have anything to do with hierarchies (subclass of in particular), but one could easily come up with similar rules to express that something should be propagated along a hierarchy (in some cases).
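To make "propagating along a hierarchy" concrete, here is a minimal sketch in Python of one such rule: if x is an instance of C and C is a subclass of D, then x is an instance of D. The triple format and the lowercase property names are assumptions for illustration only; they are not Wikidata's actual data model or property IDs.

```python
# Hypothetical toy data: (subject, property, value) triples, not real Wikidata IDs.
def infer_instance_of(triples):
    """Apply the rule
        x instance_of C  and  C subclass_of D  =>  x instance_of D
    repeatedly until no new statement can be derived."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p, c) in list(facts):
            if p != "instance_of":
                continue
            for (c2, q, d) in list(facts):
                new_fact = (x, "instance_of", d)
                if q == "subclass_of" and c2 == c and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


triples = {
    ("Stubbs", "instance_of", "cat"),
    ("cat", "subclass_of", "mammal"),
    ("mammal", "subclass_of", "animal"),
}
# Only the statements that the rule added on top of the input.
derived = infer_instance_of(triples) - triples
```

Note that the rule itself names the two properties it applies to; nothing in the data says that "subclass_of" forms a hierarchy until such a rule is written down.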
- Constraints (or suggestions) when interpreting multiple items.
For me, a constraint is a rule that infers a warning. It can follow a similar pattern as the examples I gave, but instead of deriving a new statement, it will derive that a human should take a look at a particular piece of our data to check if it is meaningful.
There is no huge theoretical challenge involved here, but a big practical one. I expect that we will refine our rules once we encounter cases where they do not yield the right result. If you look at the examples I gave, they are all mostly based on how we choose to define the meaning of our properties. This is different from our current constraints that specify how things *usually* are in the world. We can have both (constraints that warn us of unusual situations and rules that derive statements) based on similar technology, but different considerations are relevant when defining these two types of things.
As for Stubbs, there is a strong and a weak rule involved:
* Strong: all mayors are persons (I assume now that this class encompasses named animals, as suggested in earlier messages; if not, then replace "person" by a suitable generalisation that does).
* Weak: most mayors are humans.
The strong version could probably be applied to derive new information, without danger of "exceptions" -- it would be part of our characterisation of what makes something a "person" in our view (or whatever other class we pick there). The weak version should only be used to find potential problems that humans might want to check.
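The difference between the two kinds of rule can be sketched as follows. This is only an illustration under assumed names: the dictionaries, the `apply_rules` function, and the item "Alice" are hypothetical, and no existing tool works this way as far as this message states.

```python
def apply_rules(item_classes, positions):
    """item_classes: item -> set of classes it is declared to belong to.
    positions: item -> set of positions it holds.
    Returns (derived statements, warnings for human review)."""
    derived, warnings = [], []
    for item, held in positions.items():
        if "mayor" not in held:
            continue
        classes = item_classes.get(item, set())
        # Strong rule: all mayors are persons, so derive a new statement.
        if "person" not in classes:
            derived.append((item, "instance_of", "person"))
        # Weak rule: most mayors are humans, so only flag the item for review.
        if "human" not in classes:
            warnings.append(f"{item}: mayor not recorded as human, please check")
    return derived, warnings


item_classes = {"Stubbs": {"cat"}, "Alice": {"human", "person"}}
positions = {"Stubbs": {"mayor"}, "Alice": {"mayor"}}
derived, warnings = apply_rules(item_classes, positions)
```

For Stubbs, the strong rule adds a statement and the weak rule adds a warning; neither rule changes or removes the statement that Stubbs is a cat.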
Similar rules exist in many domains:
* Strong: All birds are animals (it's part of how we define "bird").
* Weak: All birds can fly (it's something we observe for actual birds, but not part of the definition of what it means to be a bird).
I suggest we start by focussing on strong rules, since they make a big contribution to documenting what we mean (by "person", by "bird", etc.), even before we have any tool support for acting on this information.
Cheers,
Markus
The first definition is used so the machine can know *if* the declaration is up in the hierarchy or sideways. When interpreting the item, the machine needs to know if the property implies that all declarations of that item are inherited. Take as an example some currently living human who has a Wikidata item and is connected to an occupation via a property. The machine should know if it should process the declarations of the occupation and apply them to the human, in whole or in part. Then there are properties which don't inherit: if the human has a declared family member, the human doesn't inherit that family member's name or birthdate.
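A rough sketch of what such a definition could look like, in Python. The `INHERITS` table, the property names, and the example items are all made up for illustration; the sketch also only covers the "in whole" case, not partial inheritance.

```python
# Hypothetical per-property flag: does the target of this property
# pass its declarations on to the item that links to it?
INHERITS = {"occupation": True, "family_member": False}


def inherited_declarations(item, links, declarations):
    """links: item -> list of (property, target) pairs.
    declarations: item -> dict of that item's own declarations.
    Returns the declarations the item picks up through inheriting properties."""
    result = {}
    for prop, target in links.get(item, []):
        # Properties without an explicit flag are treated as non-inheriting.
        if INHERITS.get(prop, False):
            result.update(declarations.get(target, {}))
    return result


links = {"Alice": [("occupation", "mayor"), ("family_member", "Bob")]}
declarations = {
    "mayor": {"field_of_work": "politics"},
    "Bob": {"name": "Bob", "birthdate": "1970-01-01"},
}
inherited = inherited_declarations("Alice", links, declarations)
```

Here the human picks up the declarations of the occupation but nothing from the family member.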
The other definition has the purpose of solving contradictions like in my example of Stubbs. If we are realistic, it's not likely that a tree structure with that much data is totally free of contradictions. So we need to have some way of telling the machine that there are, or could be, contradictions. One example of this is to define that a certain property can't have more than one value (at any given time). As a simplification (not referring to the current data structure), say that a human is a part of a certain species. If we were to define, in this case, that any item can't be part of more than one species, then the machine would detect a contradiction. In the specific example of Stubbs, the machine would determine that cats and humans are two separate species and there can be only one[1]. If we had a definition that the declaration closer to the item in the specific link has precedence, then the machine would solve it by determining that mayors are generally humans but Stubbs being a cat is an exception to that rule.
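As a minimal sketch of the idea, assuming each candidate species comes with the number of links between the item and the declaration it came from (the `resolve_species` function and the distance values are hypothetical):

```python
def resolve_species(candidates):
    """candidates: list of (species, distance) pairs, where distance counts
    the links between the item and the declaration.
    Returns (chosen_species, contradiction_found)."""
    distinct = {species for species, _ in candidates}
    # "At most one species per item": more than one distinct value is a contradiction.
    contradiction = len(distinct) > 1
    # Precedence: the declaration closest to the item wins.
    chosen = min(candidates, key=lambda c: c[1])[0] if candidates else None
    return chosen, contradiction


# Stubbs is declared a cat directly (distance 0);
# "human" only arrives indirectly via mayor (distance 1).
chosen, contradiction = resolve_species([("human", 1), ("cat", 0)])
```

The machine would still report that a contradiction was found, so the exception can be reviewed rather than silently resolved.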
[1] Didn't see the Highlander reference until I had written it.
- Svavar Kjarrval
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata