A human is not a part of a species; it is an instance of a species :)
Contradiction management is a very interesting topic, and contradictions are inherent to the Wikidata model. We can't expect everything to be consistent, given that Wikidata only reflects its sources and that two sources can disagree in a fundamentally irreconcilable way.
We could, however, expect several statements extracted from the same source to be consistent with each other, but we will rarely have enough sourced statements to draw useful inferences. This leads to subproblems such as computing the largest set of mutually consistent sources on a part of the graph, or finding the sources that lead to a contradiction when taken together.
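To make these two subproblems concrete, here is a minimal brute-force sketch. Everything in it is an assumption for illustration: the source names, the toy claims, and the simplification that every (item, property) pair is single-valued, which is not true of Wikidata properties in general.

```python
from itertools import combinations

# Hypothetical toy data: each source asserts (item, property) -> value.
SOURCES = {
    "source_a": {("Stubbs", "species"): "cat", ("Stubbs", "office"): "mayor"},
    "source_b": {("Stubbs", "species"): "human"},
    "source_c": {("Stubbs", "office"): "mayor"},
}


def consistent(names):
    """True if no two of the named sources give different values for one key."""
    seen = {}
    for name in names:
        for key, value in SOURCES[name].items():
            # setdefault stores the first value seen and returns it afterwards.
            if seen.setdefault(key, value) != value:
                return False
    return True


def largest_consistent_sets():
    """Try subsets from largest to smallest; return all consistent ones of the first size that works."""
    names = sorted(SOURCES)
    for size in range(len(names), 0, -1):
        found = [set(c) for c in combinations(names, size) if consistent(c)]
        if found:
            return found
    return []


def minimal_conflicts():
    """Smallest groups of sources that contradict each other when taken together."""
    names = sorted(SOURCES)
    conflicts = []
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            group = set(combo)
            # Skip supersets of a conflict that was already reported.
            if not consistent(combo) and not any(c <= group for c in conflicts):
                conflicts.append(group)
    return conflicts
```

This enumeration is exponential in the number of sources, so it only makes sense on a small part of the graph.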
However, we already have a qualifier that marks a source as contradicting another: "statement disputed by". We could assume that the sources involved are probably inconsistent with each other.
Or we could simply leave the consistency checks out of the inference process :) and hand them to the constraint system: if an inference follows a path that leads to a constraint violation, the community will be notified. To avoid a combinatorial explosion, the scope of inferences could be limited (not trying to compute the transitive closure of applying the inference rules). We could use some notion of "partial consistency", such as those used in constraint programming.
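A rough sketch of that idea: apply the rules for a fixed number of rounds only, then let a separate constraint check report violations instead of trying to resolve them. The facts, the "typically held by" property, the rule, and the disjointness table are all hypothetical and are not part of the actual Wikidata constraint system.

```python
# Hypothetical facts as (subject, property, value) triples.
FACTS = {
    ("Stubbs", "instance of", "cat"),
    ("Stubbs", "position held", "mayor"),
    ("mayor", "typically held by", "human"),  # made-up property for this sketch
}

# Hypothetical constraint: no item may be an instance of two classes in one group.
DISJOINT = [{"cat", "human"}]


def position_implies_type(facts):
    """If X holds position P and P is typically held by C, infer X instance of C."""
    out = set()
    for x, prop, position in facts:
        if prop != "position held":
            continue
        for subject, prop2, cls in facts:
            if subject == position and prop2 == "typically held by":
                out.add((x, "instance of", cls))
    return out


RULES = [position_implies_type]


def infer(facts, rules, max_depth):
    """Apply every rule at most max_depth rounds instead of computing the full closure."""
    known = set(facts)
    for _ in range(max_depth):
        new = set()
        for rule in rules:
            new |= rule(known)
        new -= known
        if not new:
            break  # nothing new was derived, so further rounds would change nothing
        known |= new
    return known


def violations(facts):
    """Report constraint violations; resolving them is left to the community."""
    types = {}
    for subject, prop, value in facts:
        if prop == "instance of":
            types.setdefault(subject, set()).add(value)
    reports = []
    for subject, classes in sorted(types.items()):
        for group in DISJOINT:
            clash = classes & group
            if len(clash) > 1:
                reports.append((subject, sorted(clash)))
    return reports
```

With `max_depth=1` the rule derives that Stubbs is a human, and `violations` flags the clash with the stated "cat"; with `max_depth=0` nothing is inferred and nothing is reported.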
Thinking about it, I can imagine constraint problems such as: "given an inference I deduced some way, is it fully consistent with the set of sources we have, or is there a set of sources that implies the inference is not true?" In other words: is the inference a tautology, or is it merely satisfiable, in a problem where each statement maps to a variable, the different sources are the values in the domain of each variable, and the sources must be consistent with respect to what we know they say on Wikidata?
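One possible reading of that formulation, sketched by plain enumeration rather than with a real constraint solver: each statement is a variable, its domain is the set of sources that give it a value, and an assignment is valid only if it does not trust two sources known to dispute each other. The claims, the source names, and the dispute table are assumptions made up for the example.

```python
from itertools import product

# Hypothetical: for each statement (variable), the value given by each source (domain).
CLAIMS = {
    "species of Stubbs": {"source_a": "cat", "source_b": "human"},
    "office of Stubbs": {"source_a": "mayor", "source_c": "mayor"},
}

# Hypothetical pairs of sources recorded as disputing each other.
DISPUTES = {frozenset({"source_a", "source_b"})}


def valid_assignments():
    """Yield every way of picking one source per statement without trusting a disputed pair."""
    variables = sorted(CLAIMS)
    for choice in product(*(sorted(CLAIMS[v]) for v in variables)):
        chosen = set(choice)
        if any(pair <= chosen for pair in DISPUTES):
            continue
        yield {v: CLAIMS[v][source] for v, source in zip(variables, choice)}


def classify(inference):
    """Classify an inference (a predicate over one assignment) across all valid assignments."""
    results = [inference(world) for world in valid_assignments()]
    if not results:
        return "no consistent reading"
    if all(results):
        return "tautology"
    if any(results):
        return "satisfiable"
    return "contradiction"


def is_mayor(world):
    return world["office of Stubbs"] == "mayor"


def human_mayor(world):
    return is_mayor(world) and world["species of Stubbs"] == "human"
```

Here `classify(is_mayor)` holds under every valid choice of sources, while `classify(human_mayor)` holds only when source_b is trusted for the species.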
2015-08-27 14:43 GMT+02:00 Svavar Kjarrval svavar@kjarrval.is:
So far from the other thread, the current need seems to be for two types of definitions:
- How to interpret declarations depending on associated properties.
- Constraints (or suggestions) when interpreting multiple items.
The first definition is used so the machine can know *if* the declaration is up in the hierarchy or sideways. When interpreting the item, the machine needs to know if the property implies that all declarations of that item are inherited. Take as an example a currently living human who has a Wikidata item and is connected to an occupation via a property. The machine should know if it should process the declarations of the occupation to apply them to the human, in whole or partially. Then there are properties which don't inherit: if the human has a declared family member, the human doesn't inherit the other family member's name or birthdate.
The other definition has the purpose of solving contradictions like in my example of Stubbs. If we are realistic, it's not likely that a tree structure with that much data is totally free of contradictions. So we need to have some way of telling the machine that there are, or could be, contradictions. One example of this is to define that a certain property can't have more than one value (at any given time). For simplification (not referring to the current data structure), say that a human is a part of a certain species. If we were to define, in this case, that any item can't be part of more than one species, then the machine would detect a contradiction. In the specific example of Stubbs, the machine would determine that cats and humans are two separate species and there can be only one[1]. If we had a definition that the declaration closer to the item in the specific link has precedence, then the machine would solve it by determining that mayors are generally humans but Stubbs being a cat is an exception to that rule.
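The "closer declaration has precedence" rule could be sketched roughly as follows. The graph, the property name, and the inheritance links are a made-up toy model, not the current Wikidata data structure, and ties between declarations at the same distance are not handled.

```python
from collections import deque

# Hypothetical toy model: declarations made directly on each item.
DECLARATIONS = {
    "Stubbs": {"species": "cat"},
    "mayor": {"species": "human"},
}

# Hypothetical links along which declarations are inherited.
INHERITS_FROM = {"Stubbs": ["mayor"], "mayor": []}


def resolve(item, prop):
    """Breadth-first walk: the nearest declaration wins, farther conflicting ones are reported."""
    found = []  # (distance, origin item, value), in order of increasing distance
    queue = deque([(item, 0)])
    seen = {item}
    while queue:
        node, dist = queue.popleft()
        if prop in DECLARATIONS.get(node, {}):
            found.append((dist, node, DECLARATIONS[node][prop]))
        for parent in INHERITS_FROM.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    if not found:
        return None, []
    winner = found[0]
    overridden = [entry for entry in found[1:] if entry[2] != winner[2]]
    return winner[2], overridden
```

For Stubbs, `resolve("Stubbs", "species")` picks "cat" from the item itself and lists the "human" declaration inherited from "mayor" as an overridden exception, rather than silently discarding it.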
[1] Didn't see the Highlander reference until I had written it.
- Svavar Kjarrval
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata