Hi, how is a datetime value with a precision of one year stored?
For example, for the birth date in https://www.wikidata.org/wiki/Q299687 the fine-grained value for "1700" is "1.01.1700".
But for the population date field in https://www.wikidata.org/wiki/Q216 the fine-grained value for "2014" is "30.11.2013", which is kind of unexpected.
-- Raul
Raul,
How do you get to these 'fine grain values'? They are not in the HTML rendition at the URLs you quote, and when I go to the RDF for these concepts I don't see any date information. Is this a Wikidata secret?
I would have thought that the correct approach would be to encode these values as gYear, and just record the four-digit year.
Richard
In reply to Raul Kern's message of 31/08/2015 18:19.
Wikidata-tech mailing list Wikidata-tech@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
Hi!
I would have thought that the correct approach would be to encode these values as gYear, and just record the four-digit year.
While we do have a ticket for that (https://phabricator.wikimedia.org/T92009), it's not that simple: many triple stores consider dateTime and gYear to be completely different types, and as such some queries that compare them would not work.
In reply to Stas Malyshev's message of 01.09.2015 05:17:
I agree. Our original RDF exports in Wikidata Toolkit are still using gYear, but I am not sure that this is a practical approach. In particular, this does not solve the encoding of time precisions in RDF. It only introduces some special cases for year (and also for month and day), but it cannot be used to encode decades, centuries, etc.
My current view is that it would be better to encode the actual time point with maximal precision, and to keep the Wikidata precision information independently. This applies to the full encoding of time values (where you have a way to give the precision as a separate value).
For the simple encoding, where the task is to encode a Wikidata time in a single RDF literal, things like gYear would make sense. At least full precision times (with time of day!) would be rather misleading there.
In any case, when using full precision times for cases with limited precision, it would be good to create a time point for RDF based on a uniform rule. Easiest option that requires no calendar support: use the earliest second that is within the given interval. So "20th century" would always lead to the time point "1900-01-01T00:00:00". If this is not done, it will be very hard to query for all uses of "20th century" in the data.
Markus
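The "earliest second within the interval" rule described above can be sketched roughly as follows. This is only an illustration, not the code Wikidata or Wikidata Toolkit uses: the function name, the precision names and the SPAN_IN_YEARS table are all hypothetical, it handles only years 1 to 9999, and it follows the message's convention that "20th century" maps to 1900.

```python
from datetime import datetime

# Hypothetical precision names and the number of years each one spans.
SPAN_IN_YEARS = {"millennium": 1000, "century": 100, "decade": 10, "year": 1}


def earliest_time_point(year: int, month: int = 1, day: int = 1,
                        precision: str = "year") -> str:
    """Return the earliest second of the interval that contains the date.

    Month and day may be 0 (as in "+2014-00-00") when the precision is
    coarse enough that they are ignored.
    """
    if precision in SPAN_IN_YEARS:
        span = SPAN_IN_YEARS[precision]
        year = year - year % span  # round down to the start of the span
        month, day = 1, 1
    elif precision == "month":
        day = 1
    elif precision != "day":
        raise ValueError(f"unsupported precision: {precision}")
    return datetime(year, month, day).isoformat()


print(earliest_time_point(1950, precision="century"))  # 1900-01-01T00:00:00
print(earliest_time_point(2014, 0, 0, "year"))         # 2014-01-01T00:00:00
```

With such a rule, every value that means "20th century" ends up at the same time point, so a query can match on that single literal plus the separately stored precision.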
In reply to Markus Krötzsch's message of 01/09/2015 09:26:
This is an issue which the cultural heritage community has been dealing with for decades (:-) ).
In short, a single date is never going to do an adequate job of representing (a) a period over which an event happened and (b) uncertainty over the start and/or end point in this period. These periods will almost never neatly fit into years, decades, centuries, etc.: these are just a convenience for grouping approximations together. Representing e.g. '3.1783 - 12.1820' as either decades or centuries is going to give a very misleading version of what you actually know about the period (and you still can't reduce it to a single 'date thing').
I think that you need at least two dates to represent historical event dating with any sort of honesty and flexibility. What those dates should be is a matter for discussion: the CIDOC CRM for example has the concept of "ongoing throughout" and "at some time within", which are respectively the minimal and maximal periods associated with an event. Common museum practice in the U.K. is to record 'start date' and 'end date', each with a possible qualification as regards its precision.
Richard
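A two-date representation along the lines described above could look something like the sketch below. The class and field names are hypothetical and only borrow the "at some time within" and "ongoing throughout" terms from the message; this is not the CIDOC CRM data model itself, and day-level dates are assumed purely for simplicity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Period:
    start: date
    end: date  # inclusive

    def contains(self, other: "Period") -> bool:
        """True if `other` lies entirely inside this period."""
        return self.start <= other.start and other.end <= self.end


@dataclass(frozen=True)
class EventDating:
    # Maximal period: the event happened somewhere inside it.
    at_some_time_within: Period
    # Minimal period: the event was certainly going on throughout it.
    ongoing_throughout: Optional[Period] = None

    def is_consistent(self) -> bool:
        """The minimal period, if given, must fit inside the maximal one."""
        return (self.ongoing_throughout is None
                or self.at_some_time_within.contains(self.ongoing_throughout))


# The '3.1783 - 12.1820' example, read as March 1783 to December 1820.
known = Period(date(1783, 3, 1), date(1820, 12, 31))
eighteenth_century = Period(date(1700, 1, 1), date(1799, 12, 31))
print(eighteenth_century.contains(known))  # False
```

The last line shows the point made in the message: the known period does not fit into a single century, so labelling it with one century would misstate what is known.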
In reply to Richard Light's message of 01.09.2015 10:53:
Similar considerations have influenced Wikidata to some extent: there are hidden "before" and "after" parameters for each time, which are intended to create a time interval around a "main" value. The idea, as I understand, was that "before" and "after" are non-negative integer numbers that specify the number of <precision> units for which the interval extends. For example, with precision set to "day", this would be numbers of whole days.
So far, this has not been implemented on the UI level, and many existing "before" and "after" values are somewhat random and cannot be used. My proposal would correspond to choosing the time point in such a way that it fits "before"=0 and "after"=1, which yields the current coarse-grained notion of precision.
In any case, it is clear that imprecise times on Wikidata always have an "at some time within" semantics. "Ongoing throughout" is captured by specifying "start date" and "end date", as you can see on many statements.
Markus
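As a rough sketch of how "before" and "after" could turn a main time point into an interval: the function below is hypothetical, supports only "day" and "year" precision, and treats the result as a half-open interval [start, end), which is an assumption and not something the message specifies. For "year" precision it assumes the time point is 1 January.

```python
from datetime import date, timedelta
from typing import Tuple


def interval(point: date, precision: str,
             before: int = 0, after: int = 1) -> Tuple[date, date]:
    """Extend `point` by `before` and `after` units of `precision`.

    Returns (start, end), read here as the half-open interval [start, end).
    """
    if before < 0 or after < 0:
        raise ValueError("before and after must be non-negative")
    if precision == "day":
        return point - timedelta(days=before), point + timedelta(days=after)
    if precision == "year":
        # Assumes `point` is 1 January, so replacing the year is always valid.
        return (point.replace(year=point.year - before),
                point.replace(year=point.year + after))
    raise ValueError(f"unsupported precision: {precision}")


# "before"=0 and "after"=1 on the earliest day of 2014 covers exactly that year.
print(interval(date(2014, 1, 1), "year"))
```

Under these assumptions, a year-precision value stored at the first day of the year with "before"=0 and "after"=1 covers exactly that calendar year, which matches the coarse-grained reading of precision.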
Hello Raul.
While there is indeed some inconsistency with year-precision dates (some use 01-01 for month and day, some use 00-00), I cannot reproduce the issue you report. Looking at the JSON form of Q216, I see +2014-00-00, as expected. I cannot find 2013 anywhere in the JSON. Am I missing something?
Here is the entire statement in JSON:
[
  {
    "mainsnak": {
      "snaktype": "value",
      "property": "P1082",
      "datavalue": {
        "value": {
          "amount": "+539939",
          "unit": "1",
          "upperBound": "+539940",
          "lowerBound": "+539938"
        },
        "type": "quantity"
      },
      "datatype": "quantity"
    },
    "type": "statement",
    "qualifiers": {
      "P585": [
        {
          "snaktype": "value",
          "property": "P585",
          "hash": "a1c4aa51810ae8ef53dd5e243264e9d977c02081",
          "datavalue": {
            "value": {
              "time": "+2014-00-00T00:00:00Z",
              "timezone": 0,
              "before": 0,
              "after": 0,
              "precision": 9,
              "calendarmodel": "http://www.wikidata.org/entity/Q1985727"
            },
            "type": "time"
          },
          "datatype": "time"
        }
      ]
    },
    "qualifiers-order": [
      "P585"
    ],
    "id": "Q216$2a0bbe8d-4281-d178-93b0-9e6ff904ea91",
    "rank": "normal",
    "references": [
      {
        "hash": "3c680f0b30bc470385ebab96c739ddd1c84be724",
        "snaks": {
          "P854": [
            {
              "snaktype": "value",
              "property": "P854",
              "datavalue": {
                "value": "http://db1.stat.gov.lt/statbank/selectvarval/saveselections.asp?MainTable=M3010211&PLanguage=1&TableStyle=&Buttons=&PXSId=9116&IQY=&TC=&ST=ST&rvar0=&rvar1=&rvar2=&rvar3=&rvar4=&rvar5=&rvar6=&rvar7=&rvar8=&rvar9=&rvar10=&rvar11=&rvar12=&rvar13=&rvar14=",
                "type": "string"
              },
              "datatype": "url"
            }
          ]
        },
        "snaks-order": [
          "P854"
        ]
      }
    ]
  }
]
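For anyone who wants to check the stored value themselves, a minimal sketch of reading the time value from the P585 qualifier might look like this. The helper name and the regular expression are hypothetical, and the snippet embeds only the "value" object from the statement above rather than fetching anything. The precision table is a subset of the codes used in Wikidata time values (9 meaning year).

```python
import json
import re

# Subset of the precision codes used in Wikidata time values.
PRECISION_NAMES = {7: "century", 8: "decade", 9: "year", 10: "month", 11: "day"}

# Matches strings such as "+2014-00-00T00:00:00Z"; month and day may be 00,
# so the standard datetime parsers cannot be used directly.
TIME_PATTERN = re.compile(r"^([+-]\d+)-(\d{2})-(\d{2})T\d{2}:\d{2}:\d{2}Z$")


def read_time_value(value: dict) -> tuple:
    """Return (year, month, day, precision name) for a time value object."""
    match = TIME_PATTERN.match(value["time"])
    if match is None:
        raise ValueError(f"unexpected time format: {value['time']}")
    year, month, day = (int(part) for part in match.groups())
    precision = PRECISION_NAMES.get(value["precision"], "unknown")
    return year, month, day, precision


snippet = """
{"time": "+2014-00-00T00:00:00Z", "timezone": 0, "before": 0, "after": 0,
 "precision": 9, "calendarmodel": "http://www.wikidata.org/entity/Q1985727"}
"""
print(read_time_value(json.loads(snippet)))  # (2014, 0, 0, 'year')
```

The stored value itself carries year 2014 with month and day set to 00, so any "30.11.2013" would have to come from a later conversion step and not from the JSON.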