Can somebody please explain (in simple terms) what's the difference between "all" and "truthy" RDF dumps? I've read the explanation available on the wiki [1] but I still don't get it. If I'm just a user of the data, because I want to retrieve information about a particular item and link items with other graphs... what am I missing/leaving-out by using "truthy" instead of "all"? A practical example would be appreciated since it will clarify things, I suppose.
[1] https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
Truthy=simple, direct, only Subject-Predicate-Object structure
For example: wd:Q76127 wdt:P26 wd:Q468519 (= Sukarno hasSpouse Fatmawati)
All=contains not only the Truthy ones, but also the ones with qualifiers (= how long was the marriage? when did the marriage happen?), references (sources to support the claim), and preferences (in case of multiple values, one might be preferred -- think of multiple birth dates of some people).
-fariz
Regards, Fariz
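To make the contrast concrete, here is a rough Python sketch of the triples each dump might hold for the marriage statement above. The prefixes (wd:, wdt:, p:, ps:, pq:, prov:, wds:, wdref:) follow the Wikibase RDF dump format, but the statement node id, the reference node id, the rank, and the qualifier value are hypothetical placeholders, not real data.

```python
# Truthy dump: one direct triple per best-ranked statement.
truthy = [
    ("wd:Q76127", "wdt:P26", "wd:Q468519"),  # Sukarno hasSpouse Fatmawati
]

# Hypothetical node ids, made up for this illustration.
STATEMENT = "wds:Q76127-EXAMPLE"
REFERENCE = "wdref:EXAMPLE"

# "All" dump: the same direct triple, plus a statement node that carries
# the value, the rank, the qualifiers and the references.
all_dump = truthy + [
    ("wd:Q76127", "p:P26", STATEMENT),                    # item -> statement node
    (STATEMENT, "ps:P26", "wd:Q468519"),                  # statement -> value
    (STATEMENT, "wikibase:rank", "wikibase:NormalRank"),  # rank assumed for illustration
    (STATEMENT, "pq:P580", "<start time placeholder>"),   # qualifier (start time)
    (STATEMENT, "prov:wasDerivedFrom", REFERENCE),        # reference
]


def direct_values(triples, subject, prop):
    """Values reachable through the simple wdt: predicate."""
    return [o for s, p, o in triples if s == subject and p == "wdt:" + prop]


def qualifiers(triples, subject, prop):
    """Qualifier triples, found by going through the statement nodes."""
    nodes = {o for s, p, o in triples if s == subject and p == "p:" + prop}
    return [(s, p, o) for s, p, o in triples if s in nodes and p.startswith("pq:")]


print(direct_values(truthy, "wd:Q76127", "P26"))
print(qualifiers(truthy, "wd:Q76127", "P26"))    # nothing: truthy has no statement nodes
print(qualifiers(all_dump, "wd:Q76127", "P26"))
```

With only the truthy triples you can still link Sukarno to Fatmawati; what you cannot recover is anything attached to the statement node.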
On Sun, Dec 3, 2017 at 1:49 PM, Laura Morales lauretas@mail.com wrote:
Can somebody please explain (in simple terms) what's the difference between "all" and "truthy" RDF dumps? I've read the explanation available on the wiki [1] but I still don't get it. If I'm just a user of the data, because I want to retrieve information about a particular item and link items with other graphs... what am I missing/leaving-out by using "truthy" instead of "all"? A practical example would be appreciated since it will clarify things, I suppose.
[1] https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
Additionally, and probably most importantly since it is where they take their name from, truthy statements are the statements with the best rank https://www.wikidata.org/wiki/Wikidata:Glossary#Rank on a given property. For instance, the Earth (Q2 https://www.wikidata.org/wiki/Q2) has 3 shape (P1419 https://www.wikidata.org/wiki/Property:P1419) statements: oblate spheroid (Q3241540), geoid (Q185969), and ball (Q838611). While 2 of them (oblate spheroid and ball) have a "normal" rank, 1 (geoid) was given a "preferred" rank. Thus, when considering truthy statements, the only value for the Earth's shape is geoid.
Hope that helps :)
Bests,
Maxime Lathuilière
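The rank rule described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the code Wikibase itself runs; the function name and the (value, rank) tuple layout are made up for the example.

```python
# Rank order used to pick the "best" statements; deprecated never qualifies.
RANK_ORDER = {"preferred": 2, "normal": 1}


def truthy_values(statements):
    """Return the values whose rank is the best non-deprecated rank present."""
    usable = [(value, rank) for value, rank in statements if rank in RANK_ORDER]
    if not usable:
        return []
    best = max(RANK_ORDER[rank] for _, rank in usable)
    return [value for value, rank in usable if RANK_ORDER[rank] == best]


# Shape (P1419) statements for the Earth (Q2), as described above.
earth_shape = [
    ("oblate spheroid (Q3241540)", "normal"),
    ("geoid (Q185969)", "preferred"),
    ("ball (Q838611)", "normal"),
]

print(truthy_values(earth_shape))  # ['geoid (Q185969)']

# Without a preferred statement, every normal-rank value counts as truthy.
print(truthy_values([("a", "normal"), ("b", "normal"), ("c", "deprecated")]))  # ['a', 'b']
```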
All=contains not only the Truthy ones, but also the ones with qualifiers
imho: Sometimes qualifiers are very important for multiple values (like "start time", "end time", "point in time", ...). For example, Russia https://www.wikidata.org/wiki/Q159 : the Russia P38 "currency" property has 2 statements, both with qualifiers:
* Russian ruble (start time: 1992)
* Soviet ruble (end time: September 1993)
My question: in this case, what is the "Truthy=simple" result for Russia P38 "currency"?
Regards, Imre
You will simply get two truthy results: Russian ruble and Soviet ruble. Both are Russian currencies. If you want to know when, why, where, etc., you have to check the qualified "full" statements.
That's why it's called "truthy": the answer is kind of true, depending on context.
-- Daniel Kinzler Principal Platform Engineer
Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
On 03.12.2017 at 15:06, Fariz Darari wrote:
Current state gives me one result, the Russian ruble, due to its preferred rank (notice the wdt prefix):
https://query.wikidata.org/#select%20%2a%0A%7B%20wd%3AQ159%20wdt%3AP38%20%3F...
Ah, right - the current answer would by convention be marked as preferred, so only it counts as "truthy". Sorry for the confusion.
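For anyone who wants to compare both views of the currency statements, here is a small Python sketch that builds query service links for a truthy query and for a full-statement query. It is not the exact query behind the truncated link above; the variable names (?currency, ?statement, ?rank, ?start, ?end) and the helper function are my own choices. It relies on the prefixes that query.wikidata.org predefines, and on P580 (start time) and P582 (end time) as the qualifier properties.

```python
from urllib.parse import quote

# Truthy view: only best-ranked values, via the wdt: prefix.
TRUTHY_QUERY = """SELECT * {
  wd:Q159 wdt:P38 ?currency
}"""

# Full view: go through the statement node to reach rank and qualifiers.
FULL_QUERY = """SELECT ?currency ?rank ?start ?end {
  wd:Q159 p:P38 ?statement .
  ?statement ps:P38 ?currency ;
             wikibase:rank ?rank .
  OPTIONAL { ?statement pq:P580 ?start }
  OPTIONAL { ?statement pq:P582 ?end }
}"""


def query_service_link(sparql):
    """Build a link that opens the query in the query.wikidata.org editor."""
    return "https://query.wikidata.org/#" + quote(sparql, safe="")


print(query_service_link(TRUTHY_QUERY))
print(query_service_link(FULL_QUERY))
```

Opening the first link should show only the preferred value, while the second lists every statement together with its rank and dates.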
In the RDF dump, which is used by the query service, yes, entirely. See https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format . In the JSON dump, all of this is encoded by a "mainsnak" attribute and a "qualifiers" attribute: https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON
2017-12-03 15:34 GMT+01:00 Laura Morales lauretas@mail.com:
If you want to know when, why, where, etc, you have to check the qualified "full" statements.
All these qualifiers are encoded as additional triples in "all", correct?
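As a rough illustration of the JSON side, the sketch below reads the "mainsnak" and "qualifiers" attributes from a cut-down entity. Only the fields used here are shown, the item ids are hypothetical placeholders, the "normal" rank on the second statement is an assumption, and the time strings are a simplified rendering of the dates mentioned in this thread.

```python
# A cut-down entity in the shape of the JSON dump (many fields omitted).
russia = {
    "id": "Q159",
    "claims": {
        "P38": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P38",
                    "datavalue": {"value": {"id": "Q-RUSSIAN-RUBLE"}},  # placeholder id
                },
                "rank": "preferred",
                "qualifiers": {
                    "P580": [{"datavalue": {"value": {"time": "+1992-00-00T00:00:00Z"}}}],
                },
            },
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P38",
                    "datavalue": {"value": {"id": "Q-SOVIET-RUBLE"}},  # placeholder id
                },
                "rank": "normal",  # assumed for illustration
                "qualifiers": {
                    "P582": [{"datavalue": {"value": {"time": "+1993-09-00T00:00:00Z"}}}],
                },
            },
        ]
    },
}


def main_value(statement):
    """Item id held by the statement's mainsnak."""
    return statement["mainsnak"]["datavalue"]["value"]["id"]


def qualifier_times(statement):
    """Map each qualifier property id to its list of time strings."""
    return {
        prop: [snak["datavalue"]["value"]["time"] for snak in snaks]
        for prop, snaks in statement.get("qualifiers", {}).items()
    }


for statement in russia["claims"]["P38"]:
    print(main_value(statement), statement["rank"], qualifier_times(statement))
```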
« Ranking » is a Wikibase feature to deal with this. If one of the statements is ranked « preferred », typically the one valid at the present time, then it will be the only one present in a typical query result or in an infobox extraction.
Thank you :)
One more question: as a human, how can I check the "Truthy=simple" statements, for example for Russia? Is there a website for that?
The https://www.wikidata.org/wiki/Q159 page shows all statements, but I want to see only the "Truthy=simple" statements. What is the best practice for debugging?
Thanks in advance, Imre
I had the same question about “truthy” and why we typed in “wdt” in Wikidata Query in some places and “wd” in others.
Lukas Werkmeister gave this nice, succinct explanation during our Wikicite training, which may be useful:
https://youtu.be/XLlzTtozRY4?t=2h23m10s
-Andrew
Hi!
Can somebody please explain (in simple terms) what's the difference between "all" and "truthy" RDF dumps? I've read the explanation available on the wiki [1] but I still don't get it.
Technically "truthy" is the set of statements with best non-deprecated rank for the property. Semantically, it is the value you most likely expect as the answer to a simple question "what is X of Y", like "what is the population of London" or "who is the wife of the US president?"
If I'm just a user of the data, because I want to retrieve information about a particular item and link items with other graphs... what am I missing/leaving-out by using "truthy" instead of "all"?
Historical data, i.e. current population vs. all historic population figures, current spouse vs. all previous marriages, current head of state vs. list of all people occupying the office. Possibly some other data too, such as official name vs. alias (provided that is expressed as a property), commonly accepted value vs. alternative possibilities, etc.
A practical example would be appreciated since it will clarify things, I suppose.
Current (as in, latest/best available for now) population of London would be found as "truthy" value (wdt), all other population figures - e.g. historical figures - will be under "all" (p/ps/psv).
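For reference, here is a small Python sketch of how the prefixes mentioned above expand to full IRIs, and how one could tell a truthy statement predicate from a full-statement one. The IRIs are the ones used in the Wikibase RDF format; the helper names are hypothetical, and P1082 (population) is used as the example property.

```python
# Prefix -> namespace IRI, as used in the Wikidata RDF dumps.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
    "psv": "http://www.wikidata.org/prop/statement/value/",
}


def expand(curie):
    """Turn a prefixed name such as 'wdt:P1082' into a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def is_truthy_predicate(iri):
    """True when the predicate IRI lives in the direct (wdt:) namespace."""
    return iri.startswith(PREFIXES["wdt"])


print(expand("wdt:P1082"))                       # http://www.wikidata.org/prop/direct/P1082
print(is_truthy_predicate(expand("wdt:P1082")))  # True
print(is_truthy_predicate(expand("p:P1082")))    # False
print(is_truthy_predicate(expand("psv:P1082")))  # False
```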