Hi Antoine,
The main answer to your questions is that the data model of Wikidata defines a *data structure*, not the *informal meaning* that this data structure has in an application context (that is: what we, humans, want to say when we enter it). I will try to explain this a bit better below.
How the presence or absence of a qualifier contributes to the informal meaning of a statement is not something that is defined by Wikidata. Just like Wikidata does not define what the property "office held" means, it also does not define what it means if "office held" is used with additional qualifiers. This is entirely governed by the community who uses these structures to express something. Of course, the community tries to do this in a systematic, reasoned, and intuitive way. However, there will never be a general rule for how to interpret an arbitrary qualifier.
In particular, it is not true that qualifiers are "statements about statements". First, to avoid confusion, I need to explain Wikidata's terminology. A "statement" in Wikidata comprises the whole data structure: main property and value, qualifiers, and references. The structure without the references (the thing that the references provide evidence for) is called a "claim". A claim thus contains a main property and value (or no value or unknown value) and zero or more qualifier properties with values. Every claim encodes something that is claimed about the subject of the page (the Wikidata entity), and the references given are supporting this claim (as a whole).
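To make the terminology concrete, here is a minimal sketch in Python of the nesting described above. The class and field names (Snak, Claim, Statement) and the three "kind" strings are chosen only for illustration and are not meant as the actual Wikibase API; the identifiers and dates are the ones from your email.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snak:
    """A property with a value, no value, or unknown value (illustrative names)."""
    prop: str
    kind: str = "value"          # assumed labels: "value", "novalue", "somevalue"
    value: Optional[str] = None


@dataclass(frozen=True)
class Claim:
    """Main property-value plus zero or more qualifiers, all about one subject."""
    subject: str
    main: Snak
    qualifiers: Tuple[Snak, ...] = ()


@dataclass(frozen=True)
class Statement:
    """A claim together with the references that support it as a whole."""
    claim: Claim
    references: Tuple[str, ...] = ()


president = Snak("P39", value="Q11696")

first = Statement(Claim("Q35171", president,
                        (Snak("P580", value="March 4 1885"),)))
second = Statement(Claim("Q35171", president,
                         (Snak("P580", value="March 4 1893"),)))

# Same main property-value, but two distinct claims: the qualifiers are
# part of each claim, not annotations attached to a shared triple.
print(first.claim.main == second.claim.main)
print(first.claim == second.claim)
```

In this sketch each start date lives inside exactly one claim, so there is nothing to disambiguate: the qualifier belongs to the claim that contains it.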
You already illustrated yourself how this is different from making "statements about statements": it would lead to confusion when several statements have the same main property-value but different qualifiers. This is also why our RDF export does not use the same resource for reifying statements with the same main property-value. Instead, we only share the same resource if two claims are completely identical (including qualifiers).
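The sharing rule can be sketched as follows. This is only an illustration of the idea that the identity of the reified resource depends on the complete claim; the function name claim_node_id and the hashing scheme are assumptions made for this example, not a description of how the export actually builds its identifiers.

```python
import hashlib


def claim_node_id(subject, prop, value, qualifiers=()):
    """Derive a hypothetical resource name from the complete claim."""
    # Sort qualifiers so that their order does not affect the identifier.
    parts = [subject, prop, value] + [f"{p}={v}" for p, v in sorted(qualifiers)]
    digest = hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()
    return f"claim-{digest[:12]}"


first_id = claim_node_id("Q35171", "P39", "Q11696",
                         [("P580", "March 4 1885")])
second_id = claim_node_id("Q35171", "P39", "Q11696",
                          [("P580", "March 4 1893")])
again_id = claim_node_id("Q35171", "P39", "Q11696",
                         [("P580", "March 4 1885")])

# Different qualifiers give different resources; identical claims share one.
print(first_id != second_id)
print(first_id == again_id)
```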
It is true that many qualifiers have a certain "meta" flavour, but this is not always the case. An interesting case that you might have seen is P161 ("cast member"), which is used to denote the actors in a film. The typical qualifier there is P453 ("role"), used to name the role (character) that the person played in the film. If you look at this, it is more like a ternary relation hasActor(film, actor, role) than like a "meta-statement". Indeed, an n-ary relationship cannot in general be represented by a meta-statement about a binary relation, again for the same reasons that you gave in your email. In this view, one should maybe also think of a relationship usPresident(person, start date, end date) rather than of an annotated assertion usPresident(person). Wikidata is special in that qualifiers are optional, yet the modelling view of n-ary relations might be closer to the pragmatic truth, since it avoids any meta-statements (it also elegantly justifies why there are no meta-meta-statements, i.e., qualifiers on qualifiers).
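The difference between the two views can be sketched in a few lines of Python. The tuple type PositionHeld and the dictionary layout are hypothetical names for this example; the start dates are the ones from your email, while the end dates (1889 and 1897) are added here from general knowledge and do not appear in it.

```python
from typing import NamedTuple, Optional


class PositionHeld(NamedTuple):
    """n-ary view: one row per claim, with optional 'qualifier' columns."""
    person: str
    office: str
    start: Optional[str] = None
    end: Optional[str] = None


rows = [
    PositionHeld("Q35171", "Q11696", "1885-03-04", "1889-03-04"),
    PositionHeld("Q35171", "Q11696", "1893-03-04", "1897-03-04"),
]

# "Meta-statement" view: annotations hang off the binary fact (person, office).
# Both terms collapse onto one key, and the start/end pairing is lost.
annotations = {}
for row in rows:
    slot = annotations.setdefault((row.person, row.office),
                                  {"start": set(), "end": set()})
    slot["start"].add(row.start)
    slot["end"].add(row.end)

print(len(rows))          # two separate rows in the n-ary view
print(len(annotations))   # a single annotated fact in the meta view
```

In the n-ary view each row carries its own start and end, so no heuristic about disjoint intervals is needed; in the meta view one is left with two start dates and two end dates attached to a single fact.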
Best regards,
Markus
On 31/10/13 11:39, Antoine Zimmermann wrote:
Hello,
I have a few questions about how statement qualifiers should be used.
First, my understanding of qualifiers is that they define statements about statements. So, if I have the statement:
Q17(Japan) P6(head of government) Q132345(Shinzō Abe)
with the qualifier:
P39(office held) Q274948(Prime Minister of Japan)
it means that the statement holds an office, right? It seems to me that this is incorrect and that this qualifier should in fact be a statement about Shinzō Abe. Can you confirm this?
Second, concerning temporal qualifiers: what does it mean that the "start" or "end" is "no value"? I can imagine two interpretations:
1. the statement is true forever (a person is a dead person from the moment of their death till the end of the universe)
2. (for end date) the statement is still true, we cannot predict when it's going to end.
For me, case number 2 should be marked as "unknown value" rather than "no value". But again, what does "unknown value" mean in comparison to having no indicated value?
Third, what if a statement is temporarily true (say, X held office from T1 to T2), then becomes false and becomes true again (like X held the same office from T3 to T4 with T3 > T2)? The situation exists for Q35171(Grover Cleveland) who has the following statement:
Q35171 P39(position held) Q11696(President of the United States of America)
with qualifiers, and a second occurrence of the same statement with different qualifiers. The Wikidata user interface makes it clear that there are two occurrences of the statement with different qualifiers, but how does the Wikidata data model allow me to distinguish between these two occurrences?
How do I know that:
P580(start date) "March 4 1885"
only applies to the first occurrence of the statement, while:
P580(start date) "March 4 1893"
only applies to the second occurrence of the statement? I could have a heuristic that says if two "start date"s are given, then assume that they are the starting points of two disjoint intervals. But can I always guarantee this?
Best, AZ