On 27.08.2015 14:43, Svavar Kjarrval wrote:
So far from the other thread, the current need seems to be for two types
of definitions:
1. How to interpret declarations depending on associated properties.
If I understand your explanations correctly, the first point is a very
specific case of inference, one that already thinks in terms of
"hierarchies" (of some property). I am asking: how do we even know that
some properties are supposed to be read as forming a "hierarchy"? This
is one special case of a rule of inference that one might formulate.
Have a look at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases
for some more examples of what could be relevant inferences. As you can
see, only a few of these cases have anything to do with hierarchies
(subclass of in particular), but one could easily come up with similar
rules to express that something should be propagated along a hierarchy
(in some cases).
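To make the idea of "propagating along a hierarchy" concrete, here is a
minimal sketch of one such rule in Python: if x is an instance of C and C
is a subclass of D, then x is an instance of D. The triples and the
function name infer_instance_of are made up for illustration; this is not
how any existing Wikidata tool works.

```python
# Hypothetical toy data model: statements are (subject, property, value) triples.
SUBCLASS_OF = "subclass of"
INSTANCE_OF = "instance of"


def infer_instance_of(triples):
    """Apply one rule until nothing new is derived:
    x instance of C, C subclass of D  =>  x instance of D."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for (s, p, o) in facts if p == SUBCLASS_OF]
        instance = [(s, o) for (s, p, o) in facts if p == INSTANCE_OF]
        for item, cls in instance:
            for sub, sup in subclass:
                if cls == sub:
                    new = (item, INSTANCE_OF, sup)
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return facts


triples = {
    ("Stubbs", INSTANCE_OF, "cat"),
    ("cat", SUBCLASS_OF, "mammal"),
    ("mammal", SUBCLASS_OF, "animal"),
    ("Stubbs", "position held", "mayor"),  # not touched by this rule
}

# Only the newly derived statements.
derived = infer_instance_of(triples) - triples
print(sorted(derived))
```

Note that the rule itself names the two properties it applies to. Nothing
in the data says that "subclass of" forms a hierarchy; that reading lives
entirely in the rule.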
2. Constraints (or suggestions) when interpreting multiple items.
For me, a constraint is a rule that infers a warning. It can follow a
similar pattern as the examples I gave, but instead of deriving a new
statement, it will derive that a human should take a look at a
particular piece of our data to check if it is meaningful.
There is no huge theoretical challenge involved here, but a big
practical one. I expect that we will refine our rules once we encounter
cases where they do not yield the right result. If you look at the
examples I gave, they are all mostly based on how we choose to define
the meaning of our properties. This is different from our current
constraints that specify how things *usually* are in the world. We can
have both (constraints that warn us of unusual situations and rules that
derive statements) based on similar technology, but different
considerations are relevant when defining these two types of things.
As for Stubbs, there is a strong and a weak rule involved:
* Strong: all mayors are persons (I assume now that this class
encompasses named animals, as suggested in earlier messages; if not,
then replace "person" by a suitable generalisation that does).
* Weak: most mayors are humans.
The strong version could probably be applied to derive new information,
without danger of "exceptions" -- it would be part of our
characterisation of what makes something a "person" in our view (or
whatever other class we pick there). The weak version should only be
used to find potential problems that humans might want to check.
Similar rules exist in many domains:
* Strong: All birds are animals (it's part of how we define "bird")
* Weak: All birds can fly (it's something we observe for actual birds,
but not part of the definition of what it means to be a bird).
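As a rough sketch of how the two kinds of rules could be treated
differently, the Python below applies the strong mayor rule to derive new
statements and the weak one only to produce warnings for a human to check.
The Statement class, the function apply_mayor_rules, the property labels,
and the item "Jane Doe" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    prop: str
    value: str


def apply_mayor_rules(statements):
    """Return (derived statements, warnings) for the two mayor rules."""
    facts = set(statements)
    derived = set()
    warnings = []
    mayors = {s.subject for s in facts
              if s.prop == "position held" and s.value == "mayor"}
    for item in sorted(mayors):
        # Strong rule: all mayors are persons, so a new statement is derived.
        person = Statement(item, "instance of", "person")
        if person not in facts:
            derived.add(person)
        # Weak rule: most mayors are humans, so exceptions are only flagged.
        if Statement(item, "instance of", "human") not in facts:
            warnings.append(f"{item}: mayor not recorded as human, please check")
    return derived, warnings


statements = [
    Statement("Stubbs", "position held", "mayor"),
    Statement("Stubbs", "instance of", "cat"),
    Statement("Jane Doe", "position held", "mayor"),  # hypothetical item
    Statement("Jane Doe", "instance of", "human"),
]

derived, warnings = apply_mayor_rules(statements)
print(derived)
print(warnings)
```

With this toy data, both mayors get a derived "instance of person"
statement, while only Stubbs triggers a warning. Nothing is rejected or
overwritten in either case.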
I suggest we start by focussing on strong rules, since they make a big
contribution to documenting what we mean (by "person", by "bird", etc.),
even before we have any tool support for acting on this information.
Cheers,
Markus
The first definition is used so the machine can know *if* the
declaration is up in the hierarchy or sideways. When interpreting the
item, the machine needs to know if the property implies that all
declarations of that item are inherited. Take as an example some currently
living human who has a Wikidata item and is connected to an occupation
via a property. The machine should know if it should process the
declarations of the occupation to apply them to the human, in whole or
in part. Then there are properties which don't inherit: if the human has
a declared family member, the human doesn't inherit the other family
member's name or birthdate.
The other definition has the purpose of solving contradictions like in
my example of Stubbs. If we are realistic, it's not likely that a tree
structure with that much data is totally free of contradictions. So we
need to have some way of telling the machine that there are, or could
be, contradictions. One example of this is to define that a certain
property can't have more than one value (at any given time). As a
simplification (not referring to the current data structure), say that a
human is a part of a certain species. If we were to define, in this
case, that any item can't be part of more than one species, then the
machine would detect a contradiction. In the specific example of Stubbs,
the machine would determine that cats and humans are two separate
species and there can be only one[1]. If we had a definition that the
declaration closer to the item in the specific link has precedence, then
the machine would solve it by determining that mayors are generally
humans but Stubbs being a cat is an exception to that rule.
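The precedence idea described above could be sketched roughly as follows
in Python. Each candidate value carries a distance from the item (0 for a
declaration on the item itself, 1 for one reached through a linked item
such as an occupation). The function resolve_single_value and this
distance encoding are assumptions made purely for illustration.

```python
def resolve_single_value(candidates):
    """candidates: list of (value, distance) pairs for a single-valued property.

    Returns (chosen value or None, list of overridden or conflicting values).
    """
    values = {value for value, _ in candidates}
    if len(values) <= 1:
        # No contradiction: zero or one distinct value.
        return next(iter(values), None), []
    best = min(distance for _, distance in candidates)
    closest = {value for value, distance in candidates if distance == best}
    if len(closest) > 1:
        # Tie at the same distance: precedence cannot decide.
        return None, sorted(values)
    winner = closest.pop()
    return winner, sorted(values - {winner})


# Stubbs: "cat" is declared on the item, "human" comes via the mayor link.
species = [("cat", 0), ("human", 1)]
chosen, overridden = resolve_single_value(species)
print(chosen, overridden)
```

Here the closer declaration ("cat") wins and "human" is reported as
overridden. Two different values at the same distance are left unresolved
for a human to look at.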
[1] Didn't see the Highlander reference until I had written it.
- Svavar Kjarrval
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata