On 01.09.2015 05:17, Stas Malyshev wrote:
Hi!
I would have thought that the correct approach would be to encode these
values as gYear, and just record the four-digit year.
While we do have a ticket for that (https://phabricator.wikimedia.org/T92009),
it's not that simple since
many triple stores consider dateTime and gYear to be completely
different types and as such some queries between them would not work.
I agree. Our original RDF exports in Wikidata Toolkit are still using
gYear, but I am not sure that this is a practical approach. In
particular, this does not solve the encoding of time precisions in RDF.
It only introduces some special cases for year (and also for month and
day), but it cannot be used to encode decades, centuries, etc.
My current view is that it would be better to encode the actual time
point with maximal precision, and to keep the Wikidata precision
information independently. This applies to the full encoding of time
values (where you have a way to give the precision as a separate value).
For the simple encoding, where the task is to encode a Wikidata time in
a single RDF literal, things like gYear would make sense. At least full
precision times (with time of day!) would be rather misleading there.
In any case, when using full precision times for cases with limited
precision, it would be good to create a time point for RDF based on a
uniform rule. The easiest option, which requires no calendar support, is to
use the earliest second that is within the given interval. So "20th century"
would always lead to the time point "1900-01-01T00:00:00". If this is
not done, it will be very hard to query for all uses of "20th century"
in the data.
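
To make the rule concrete, here is a rough Python sketch of how such a time
point could be computed. It is only an illustration, not what Wikidata Toolkit
or the RDF export does: the function name is made up, the precision codes are
the usual Wikidata ones (7 = century, 9 = year, 11 = day), and I assume that a
decade, century or millennium is the block of years you get by rounding down
to a multiple of 10, 100 or 1000, so that "20th century" starts in 1900 as in
the example above. Years before 1000 are rejected to sidestep year-zero and
BCE questions.

```python
# Wikidata precision codes for the cases handled in this sketch.
MILLENNIUM, CENTURY, DECADE, YEAR, MONTH, DAY = 6, 7, 8, 9, 10, 11

# Assumption: coarse intervals are blocks of years aligned to multiples of
# 10/100/1000 (so "20th century" is taken to be 1900-1999).
_YEAR_BLOCK = {MILLENNIUM: 1000, CENTURY: 100, DECADE: 10}


def earliest_time_point(year, month=1, day=1, precision=DAY):
    """Return the earliest second of the interval as a dateTime-style string.

    Hypothetical helper: only years >= 1000 and precisions 6 to 11.
    """
    if year < 1000:
        raise ValueError("this sketch only handles years >= 1000")
    if precision in _YEAR_BLOCK:
        # Round the year down to the start of its decade/century/millennium.
        year -= year % _YEAR_BLOCK[precision]
        month, day = 1, 1
    elif precision == YEAR:
        month, day = 1, 1
    elif precision == MONTH:
        day = 1
    elif precision != DAY:
        raise ValueError("unsupported precision: %r" % (precision,))
    # Time of day is always midnight, the first second of the interval.
    return "%04d-%02d-%02dT00:00:00" % (year, month, day)


print(earliest_time_point(1950, precision=CENTURY))       # 1900-01-01T00:00:00
print(earliest_time_point(1987, 6, 15, precision=DECADE)) # 1980-01-01T00:00:00
```

Since every value with century precision inside the same century maps to the
same string, a query for "20th century" only has to match that one literal
together with the separately stored precision.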
Markus