On Thu 27 Aug 2015 15:52, Markus Krötzsch wrote:
On 27.08.2015 14:43, Svavar Kjarrval wrote:
So far from the other thread, the current need seems to be for two types of definitions:
- How to interpret declarations depending on associated properties.
If I understand your explanations correctly, the first point is a very specific case of inference, which is already thinking in terms of "hierarchies" (of some property). I am asking: how do we even know that some properties are supposed to be read as forming a "hierarchy"? This is one special case of a rule of inference that one might formulate. Have a look at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases
for some more examples of what could be relevant inferences. As you can see, only a few of these cases have anything to do with hierarchies (subclass of in particular), but one could easily come up with similar rules to express that something should be propagated along a hierarchy (in some cases).
- Constraints (or suggestions) when interpreting multiple items.
For me, a constraint is a rule that infers a Warning. It can follow a similar pattern as the examples I gave, but instead of deriving a new statement, it will derive that a human should look at a particular piece of our data to check if it is meaningful.
There is no huge theoretical challenge involved here, but a big practical one. I expect that we will refine our rules once we encounter cases where they do not yield the right result. If you look at the examples I gave, they are mostly based on how we choose to define the meaning of our properties. This is different from our current constraints, which specify how things *usually* are in the world. We can have both (constraints that warn us of unusual situations and rules that derive statements) based on similar technology, but different considerations are relevant when defining these two types of things.
As for Stubbs, there is a strong and a weak rule involved:
- Strong: all mayors are persons (I assume now that this class
encompasses named animals, as suggested in earlier messages; if not, then replace "person" by a suitable generalisation that does).
- Weak: most mayors are humans.
The strong version could probably be applied to derive new information, without danger of "exceptions" -- it would be part of our characterisation of what makes something a "person" in our view (or whatever other class we pick there). The weak version should only be used to find potential problems that humans might want to check.
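To make the difference concrete, here is a minimal sketch in Python. It is not tied to any existing Wikidata tooling; the toy data, the item "Alice" and both function names are made up for illustration. The strong rule produces new statements, while the weak rule only produces a list of items a human might want to check.

```python
# Hypothetical toy data: item -> classes it is stated to be an instance of.
# "Alice" is an invented item, used only for contrast with Stubbs.
instance_of = {
    "Stubbs": {"mayor", "cat"},
    "Alice": {"mayor", "human"},
}

def apply_strong_rule(data, if_class, then_class):
    """Strong rule: every instance of if_class is also an instance of then_class.

    Returns the statements that can be derived; the input is not modified.
    """
    derived = {}
    for item, classes in data.items():
        if if_class in classes and then_class not in classes:
            derived[item] = then_class
    return derived

def apply_weak_rule(data, if_class, usually_class):
    """Weak rule: most instances of if_class are instances of usually_class.

    Nothing is derived; the result is a list of items to review by hand.
    """
    return sorted(
        item
        for item, classes in data.items()
        if if_class in classes and usually_class not in classes
    )

# Strong: all mayors are persons -> new statements for both items.
derived = apply_strong_rule(instance_of, "mayor", "person")

# Weak: most mayors are humans -> only a warning for Stubbs.
warnings = apply_weak_rule(instance_of, "mayor", "human")
```

With this data, `derived` assigns "person" to both Stubbs and Alice, and `warnings` contains only Stubbs. Neither function blocks or changes any statement already in the data.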
Similar rules exist in many domains:
- Strong: All birds are animals (it's part of how we define "bird")
- Weak: All birds can fly (it's something we observe for actual birds,
but not part of the definition of what it means to be a bird).
I suggest we start by focussing on strong rules, since they make a big contribution to documenting what we mean (by "person", by "bird", etc.), even before we have any tool support for acting on this information.
Cheers,
Markus
I'm a big advocate of strong versions. My suggestion of "exceptions" was a practical one, since we can't reasonably expect all data to be consistent with strong definitions. Personally I wouldn't support a weak version when a feasible strong alternative is available. The constraints I had in mind are only suggestive and would only serve as warnings, so I think we agree there. The constraints wouldn't be enforced but rather used to detect potential mistakes in the data. They wouldn't prevent someone from adding the information that Stubbs is a mayor, even when it would lead to the contradiction of him being both a human and a cat.
Regarding your question about my former definition, the point is to serve as a classification of what can reasonably be inferred from the relationship between two items, depending on the property used to connect them. Take the case of Stubbs. Stubbs is a mayor, and from that connection we can (or should be able to) assume Stubbs is also a public official, a head of government and a politician. However, we shouldn't be able to assume that Stubbs's Freebase identifier is the same as the town's. The purpose is to enable machines to retrieve an item and extract all the relevant facts which can reasonably be inferred from the relationship of that item with other items, recursively, until all the branches are exhausted.
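As a rough sketch of what I mean, assuming we had a list of properties that are safe to follow: the statements, the property labels in FOLLOW, the placeholder identifier and the function name below are all invented for illustration and do not reflect the actual data model. The machine follows only the listed properties, recursively, and ignores everything else.

```python
from collections import deque

# Hypothetical toy statements: (subject, property, value).
statements = [
    ("Stubbs", "position held", "mayor"),
    ("Stubbs", "mayor of", "the town"),
    ("mayor", "subclass of", "head of government"),
    ("mayor", "subclass of", "public official"),
    ("head of government", "subclass of", "politician"),
    ("the town", "Freebase ID", "placeholder-id"),
]

# Assumption: only these properties allow facts to be carried over.
FOLLOW = {"position held", "subclass of"}

def inferred_values(start, statements, follow=FOLLOW):
    """Collect everything reachable from start via the properties in follow."""
    # Index the followable statements by subject for quick lookup.
    index = {}
    for subject, prop, value in statements:
        if prop in follow:
            index.setdefault(subject, []).append(value)

    seen = set()
    queue = deque([start])
    while queue:  # stops once all branches are exhausted
        node = queue.popleft()
        for value in index.get(node, []):
            if value not in seen and value != start:
                seen.add(value)
                queue.append(value)
    return seen

facts = inferred_values("Stubbs", statements)
print(sorted(facts))
# ['head of government', 'mayor', 'politician', 'public official']
```

The town and its identifier are never reached, because neither "mayor of" nor "Freebase ID" is in the list of properties to follow. The `seen` set also keeps the walk from looping forever if the data contains a cycle.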
- Svavar Kjarrval