The issues unravel like a ball of string when you look at time.

There is all the cultural stuff, then there is the astronomy, geodesy and physics that frustrate you if you want to get it right: leap seconds, Congress changing daylight saving time, relativity, etc.

The Allen algebra is, I think, the most useful theory of time. Here you look at "times" as a union of intervals, which gives a clear way to say "Thursday" or "Easter" or "Ramadan" or the times the Burger King down the road is open.
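As a rough illustration of the interval view, here is a minimal Python sketch that names which of Allen's thirteen relations holds between two intervals. The `Interval` class, the `allen_relation` function and the opening hours are made up for this example; a recurring "time" like the ones above would be a collection of such intervals rather than a single one.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    """A single stretch of time; hypothetical helper type for this sketch."""
    start: datetime
    end: datetime  # must be strictly later than start

    def __post_init__(self):
        if not self.start < self.end:
            raise ValueError("interval must have start < end")


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the one Allen relation that holds from a to b."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    # From here on the two intervals share some stretch of time.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    return "overlaps" if a.start < b.start else "overlapped-by"


# One particular Thursday and some assumed opening hours on that day.
thursday = Interval(datetime(2015, 7, 2), datetime(2015, 7, 3))
open_hours = Interval(datetime(2015, 7, 2, 6), datetime(2015, 7, 2, 23))
print(allen_relation(open_hours, thursday))  # during
```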

You can use intervals to specify the accuracy of a date (e.g. when we say that Franz Kafka was born on July 3, 1883, this is a truncation of a birth "event" which could be timed to a minute or so). On the other hand, if we refer to a public festival such as "Tennō Tanjōbi" (the emperor's birthday), or just to the actual day of "July 3, 1883", the interval algebra handles that too. There is some conflation between the two readings, but in practice it is not so bad, and with the property graph model Wikidata uses you can add a statement that qualifies ~your~ point of view of how to think about it.
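To make the "precision as an interval" idea concrete, here is a small Python sketch that turns a partial date (year only, year and month, or a full day) into the half-open interval it covers. The function name `date_interval` is hypothetical and the half-open convention is just one reasonable choice, not something any particular system prescribes.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def date_interval(year: int, month: Optional[int] = None,
                  day: Optional[int] = None) -> Tuple[date, date]:
    """Return the half-open interval [start, end) covered by a partial date."""
    if month is None:
        if day is not None:
            raise ValueError("a day needs a month")
        # Year precision: the whole calendar year.
        return date(year, 1, 1), date(year + 1, 1, 1)
    if day is None:
        # Month precision: from the first of this month to the first of the next.
        start = date(year, month, 1)
        if month == 12:
            return start, date(year + 1, 1, 1)
        return start, date(year, month + 1, 1)
    # Day precision: a single calendar day.
    start = date(year, month, day)
    return start, start + timedelta(days=1)


print(date_interval(2015))        # the whole year 2015
print(date_interval(2015, 7))     # July 2015
print(date_interval(2015, 7, 2))  # the single day
```

An event known only to the day and a festival that really does last the whole day end up as the same interval here, which is the conflation mentioned above; telling them apart takes extra metadata, not a different interval.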

The good news too is that "unusual" use cases of time are remarkably unusual.  For instance,  some W3C standards suggest you can write an ISO date like

17413-04-07

adding one or more digits to the year. Practically nobody uses this because precise dates aren't known for earlier civilizations other than for astronomical events; at ±10,000 years the defects of common calendars start to show, and if you go out to ±100,000 so do the errors in all of the models. It is not uncommon for science fiction writers to give specific years, months and days in the 2000-2999 range, but it is incredibly unusual after +10,000. People who build nuclear waste dumps need to think about times in that 10-100 kyear range, but after the facility closes, it doesn't matter if a day is a Sunday or a Monday.
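Python's standard library is one example of a mainstream date type that makes the same trade-off and simply stops at year 9999. The sketch below splits an expanded-year string into its parts and checks whether `datetime.date` can hold it; `parse_expanded` and `fits_stdlib` are hypothetical helper names, and the regular expression is only an approximation of the lexical forms the various standards allow.

```python
import datetime
import re

# Optional sign, four or more year digits, then -MM-DD.
EXPANDED = re.compile(r"^([+-]?\d{4,})-(\d{2})-(\d{2})$")


def parse_expanded(text: str):
    """Split a YYYY...-MM-DD string into (year, month, day) integers."""
    m = EXPANDED.match(text)
    if m is None:
        raise ValueError(f"not a recognizable date: {text!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3))


def fits_stdlib(year: int, month: int, day: int) -> bool:
    """True if datetime.date can represent this date."""
    try:
        datetime.date(year, month, day)
    except ValueError:
        return False
    return True


print(datetime.MAXYEAR)                             # 9999
print(parse_expanded("17413-04-07"))                # (17413, 4, 7)
print(fits_stdlib(*parse_expanded("17413-04-07")))  # False
print(fits_stdlib(*parse_expanded("2015-07-02")))   # True
```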

The same is true for calendars. For instance, if a document is signed in Saudi Arabia, it may have an Islamic date stamped on it. As a westerner who wants to know when the document was signed, you are better served seeing a western date, at least at first glance.

In a western cultural zone you can squash dates from other cultural zones to your own date system. I think it would be awesome if the Arabic slice of Wikidata/Wikipedia showed Islamic dates and our cultural zone showed western ones, and if you could flip a switch and see the one you want (or look up which year of the emperor it is).
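Python's standard library has no Islamic calendar, so as a stand-in here is a minimal sketch of "squashing" a date from another calendar into the Gregorian one, using the Julian calendar as the simplest second calendar. It goes through the Julian Day Number with the usual integer formula; the function names are hypothetical, inputs are not validated, and the result is limited to the year range `datetime.date` supports.

```python
from datetime import date

# Python's proleptic Gregorian ordinal 1 (0001-01-01) is Julian Day Number 1721426.
JDN_MINUS_ORDINAL = 1721425


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12       # 1 for January and February, else 0
    y = year + 4800 - a          # years counted from a March-based epoch
    m = month + 12 * a - 3       # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to the same day as a Gregorian date."""
    return date.fromordinal(julian_to_jdn(year, month, day) - JDN_MINUS_ORDINAL)


# The day after Julian October 4, 1582 was Gregorian October 15, 1582.
print(julian_to_gregorian(1582, 10, 5))    # 1582-10-15
# The "October Revolution" date in the Julian calendar then used in Russia.
print(julian_to_gregorian(1917, 10, 25))   # 1917-11-07
```

The "flip a switch" display would then keep one stored value and choose the rendering calendar per reader.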

The design of the JDK 8 java.time framework is sound and ought to give some inspiration as to how to think about calendars.

https://docs.oracle.com/javase/8/docs/api/java/time/package-summary.html

I'd say make a core type that handles Gregorian dates and supports the Allen Algebra,  then you can define more data types and extensions to properly handle other calendar systems. 

On Thu, Jul 2, 2015 at 10:16 AM, Neil Harris <neil@tonal.clara.co.uk> wrote:
On 01/07/15 15:00, Pierpaolo Bernardi wrote:
On Wed, Jul 1, 2015 at 8:17 AM, Markus Krötzsch
<markus@semantic-mediawiki.org> wrote:
Dear Pierpaolo,

This thread was only about Julian and Gregorian calendar dates. If and how
other calendar models should be supported in some future is another
(potentially big) discussion. As you said, there are many issues there.
Let's first make sure that we handle the "easy" 99.9% of cases correctly
before discussing any more complicated options.


Just for future reference, if anyone's interested, THE book on this topic is "Calendrical Calculations".

Alas, their code is closed-source, but the book is still the best reference I know of.

http://emr.cs.iit.edu/home/reingold/calendar-book/third-edition/

-- Neil



_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata



--
Paul Houle

Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes