On 27.08.2015 14:43, Svavar Kjarrval wrote:
So far from the other thread, the current need seems to be for two
types of definitions:
1. How to interpret declarations depending on associated properties.
If I understand your explanations correctly, the first point is a very
specific case of inference, which is already thinking in terms of
"hierarchies" (of some property). I am asking: how do we even know
that some properties are supposed to be read as forming a "hierarchy".
This is one special case of a rule of inference that one might
formulate. Have a look at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases
for some more examples of what could be relevant inferences. As you
can see, only a few of these cases have anything to do with hierarchies
(subclass of in particular), but one could easily come up with similar
rules to express that something should be propagated along a hierarchy
(in some cases).
2. Constraints (or suggestions) when interpreting multiple items.
For me, a constraint is a rule that infers a Warning. It can follow a
similar pattern as the examples I gave, but instead of deriving a new
statement, it will derive that a human should take a closer look at a
particular piece of our data to check if it is meaningful.
There is no huge theoretical challenge involved here, but a big
practical one. I expect that we will refine our rules once we
encounter cases where they do not yield the right result. If you look
at the examples I gave, they are mostly based on how we choose to
define the meaning of our properties. This is different from our
current constraints, which specify how things *usually* are in the world.
We can have both (constraints that warn us of unusual situations and
rules that derive statements) based on similar technology, but
different considerations are relevant when defining these two types of
things.
As for Stubbs, there is a strong and a weak rule involved:
* Strong: all mayors are persons (I assume now that this class
encompasses named animals, as suggested in earlier messages; if not,
then replace "person" by a suitable generalisation that does).
* Weak: most mayors are humans.
The strong version could probably be applied to derive new
information, without danger of "exceptions" -- it would be part of our
characterisation of what makes something a "person" in our view (or
whatever other class we pick there). The weak version should only be
used to find potential problems that humans might want to check.
Similar rules exist in many domains:
* Strong: All birds are animals (it's part of how we define "bird")
* Weak: All birds can fly (it's something we observe for actual birds,
but not part of the definition of what it means to be a bird).
I suggest we start by focussing on strong rules, since they make a big
contribution to documenting what we mean (by "person", by "bird",
etc.), even before we have any tool support for acting on this
information.
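
To make the distinction concrete, here is a minimal sketch in Python. It
is only an illustration under assumptions: the property labels, the rule
tables and the function name are all made up for this example and are
not an existing Wikidata feature. A strong rule derives a new statement,
a weak rule only derives a warning for a human to check.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    prop: str
    value: str


# Hypothetical rule tables: (property, value) -> (property, value).
# Strong rules hold by definition; weak rules hold only "usually".
STRONG_RULES = {("occupation", "mayor"): ("instance of", "person")}
WEAK_RULES = {("occupation", "mayor"): ("instance of", "human")}


def apply_rules(statements):
    """Return (derived statements, warnings) for the given statements."""
    known = set(statements)
    derived = set()
    warnings = []
    for s in statements:
        key = (s.prop, s.value)
        if key in STRONG_RULES:
            prop, value = STRONG_RULES[key]
            new = Statement(s.subject, prop, value)
            if new not in known:
                derived.add(new)  # safe to add: part of the definition
        if key in WEAK_RULES:
            prop, value = WEAK_RULES[key]
            if Statement(s.subject, prop, value) not in known:
                # never derive anything here, only ask a human to look
                warnings.append(
                    f"{s.subject}: expected '{prop}' = '{value}', please check"
                )
    return derived, warnings


data = [
    Statement("Stubbs", "occupation", "mayor"),
    Statement("Stubbs", "instance of", "cat"),
]
derived, warnings = apply_rules(data)
```

With this toy data the strong rule derives that Stubbs is a person,
while the weak rule only flags Stubbs for review, since he is not
recorded as a human.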
Cheers,
Markus
I'm a big advocate of strong versions. My suggestion for "exceptions"
was practical, since we can't reasonably expect all data to be consistent
with strong definitions. Personally I wouldn't support weak versions
when a feasible strong alternative is available. The
constraints I had in mind are only suggestive and would only serve as
warnings, so I think we agree there. The constraints wouldn't be
enforced but rather used to detect potential mistakes in the data. They
wouldn't prevent someone from adding the information that Stubbs is a
mayor, even when it would lead to the contradiction of him being both a
human and a cat.
Regarding your question about my earlier definition, the point is to
serve as a classification of what can reasonably be inferred from the
relationship of two items, depending on the property used to connect
them. Take the case of Stubbs: Stubbs is a mayor, and from that
connection we can (or should be able to) assume Stubbs is also a
public official, a head of government and a politician. However, we
shouldn't be able to assume Stubbs's Freebase identifier is the same
as the town's. The purpose is to enable machines to retrieve an item
and extract all the relevant facts which can reasonably be inferred
based on the relationship of that item with other items,
recursively, until all the branches are exhausted.
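
A rough sketch of what I mean, in Python. Everything here is an
assumption for illustration: the toy data, the property labels, the
placeholder identifier and the idea of a hand-maintained set of
"inheritable" properties. The machine only follows properties declared
as inheritable and ignores the rest, such as identifiers.

```python
from collections import deque

# Toy statements: item -> list of (property, value). Illustrative only.
STATEMENTS = {
    "Stubbs": [
        ("occupation", "mayor"),
        ("Freebase ID", "placeholder-id"),  # placeholder, not a real ID
    ],
    "mayor": [
        ("subclass of", "head of government"),
        ("subclass of", "public official"),
    ],
    "head of government": [("subclass of", "politician")],
}

# Hypothetical classification: properties along which facts propagate.
INHERITABLE = {"occupation", "subclass of"}


def inferred_classes(item, statements, inheritable):
    """Collect everything reachable from item via inheritable properties."""
    seen = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for prop, value in statements.get(current, []):
            # identifiers and other non-inheritable properties are skipped
            if prop in inheritable and value not in seen:
                seen.add(value)
                queue.append(value)
    return seen


result = inferred_classes("Stubbs", STATEMENTS, INHERITABLE)
```

Here the result contains mayor, head of government, public official and
politician, but nothing derived from the Freebase identifier, because
that property is not in the inheritable set.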
- Svavar Kjarrval