Re: Proposal - temporal contrib module

From: Scott Bailey <artacus(at)comcast(dot)net>
To:
Cc: hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal - temporal contrib module
Date: 2009-10-29 18:38:51
Message-ID: 4AE9E13B.7040303@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>> I would like to add a temporal contrib module. The most important piece
>> would be adding a period data type and some support functions. Jeff
>> Davis and I both have temporal projects on pgFoundry, and we've been
>> collaborating for a while.
>
> I presume you're going to need some backend support and possibly new
> syntax for some of the operations, right? That seems more urgent to
> discuss than the possible inclusion into contrib.

Jeff Davis is already working on solving these issues for 8.5. But
rather than wait until 8.6 or later to get a period data type added to
core, I felt it was important to get the period type out in front of
people to start using and testing. Plus we wanted to gauge interest from
the community. Should we forge ahead and try to become the first general
purpose database with support for temporal databases? Or should we wait
another 20 years and see if an official specification materializes?

> I'm very pleased to see people working on temporal issues, BTW! I used
> to work on a database that did a lot of temporal operations, but the
> DBMS didn't have any temporal data types or operations so we had to use
> a lot of triggers etc. to achieve that, and it didn't perform well.
>
>> Nulls - A common use case for periods is for modeling valid time. Often
>> the end point is not known. For instance, you know when an employee has
>> been hired but the termination time typically wouldn't be known ahead of
>> time. We can either represent these with a null end time or with
>> infinity. But I'm not sure how to deal with them. Obviously we can test
>> for containment and overlap. But what about length or set operations?
>
> Hmm. Infinity feels like a better match. The behavior of length and set
> operations falls out of that naturally. For example, length of a period
> with an infinite beginning or end is infinite. For set operations, for
> example the intersection of [123, infinity] and [100, 160] would be
> [123, 160].

Two different answers from two respondents. And is there a conceptual
difference between NULL and +/- infinity? Nothing lasts forever. So when
would it make sense to use one verses the other? So in the example I gave

>> Non-contiguous Sets - A period defines a contiguous set of time. But
>> many times we need to work with non-contiguous sets (work shifts in a
>> week, bus schedules, etc). Right now, I'm using period arrays. But
>> period arrays can contain overlapping and adjacent periods. And we have
>> no way to indicate that a period array has been coalesced into a
>> non-contiguous set. And what indexing strategies could be used with
>> non-contiguous sets?
>
> I'd stick to your current definition that a period is a contiguous set
> of time. A non-contiguous set consists of multiple contiguous periods,
> so it can be represented as multiple rows in a table.

That's pretty much my sentiments exactly. But Jeff wanted to be sure
that we didn't make a decision now that would limit it's usefulness later.

>> Temporal Keys - We need two types of temporal keys. A primary key,
>> exclusion type prevents overlap so someone isn't at two places at the
>> same time. And a foreign key, inclusion type so we can check that the
>> valid time of a child is contained with in the valid time of the parent.
>> Jeff is working on the former, but there is no easy way to do the latter.
>
> I'm very excited about this. Foreign keys don't seem that hard, you'll
> need foreign key triggers like we have today, but check for "within"
> instead of "equal".
>
>> Temporal Data and the Relational Model - Date et al
>> http://books.google.com/books?isbn=1558608559
>
> +1 for the approach in this book. I'm not familiar enough with the TSQL2
> spec to say whether it follows it.
>
> It should also be kept in mind that although this class of problems are
> generally thought of as temporal issues, IOW dealing with time, the same
> approach works with ranges of integers or any other datatype with a
> well-defined sort order. It would be nice if the temporal data type
> would allow that too.

The period concept relates very closely to mathematical intervals. (In
fact, I would argue that the SQL interval should actually be named
period and the SQL period should be named interval so they matched their
mathematical counterparts.) My primary concern is timestamp intervals,
but I see no reason the exact same concepts wouldn't apply to intervals
of integers, floats, dates, etc.

And actually there is a fair amount of overlap with spatial. The main
difference being the number of dimensions. But the concepts of overlap,
containment, and set operations like union and intersection are the same.

Scott Bailey

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2009-10-29 19:05:50 Re: Proposal - temporal contrib module
Previous Message Jeff Davis 2009-10-29 18:25:48 Re: Proposal - temporal contrib module