Re: Periods

From: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Vik Fearing <vik(dot)fearing(at)protonmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Periods
Date: 2019-07-05 19:07:37
Message-ID: CA+renyUCxrwdiVHyngqEEboG87+Xfk0T01O6MRg_cPsKXRbWMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 4, 2019 at 11:44 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> I think that the functionality in your patch is already integrated in
> Paul's patch for temporal PK/FK elsewhere ... is that correct, or is
> this patch orthogonal to that work?

Hi Vik, I wasn't aware that you had moved your work over to an
extension. It looks like you've made a lot of progress! I'd be eager
to work with you on getting this into core.

Alvaro: it's a mix of orthogonal and not orthogonal. :-) My work lets
you use range types directly in SQL:2011 constructs, whereas Vik's
work lets you use SQL:2011 PERIODs. I would like to support *both*, so
I've been intending to add Vik's original patch letting you define
PERIODs to my own work.

There is no conflict in supporting both ranges and PERIODs because you
can't name a PERIOD the same as an existing column.

Behind the scenes the easiest way to implement PERIODs is with range
types, so there would be very little duplication to permit both.

PERIODs suffer from being outside relational theory as a
quasi-attribute. You especially see this in composing queries & result
sets. For example with ranges you could say this:

WITH x AS (
SELECT * FROM y
FOR PORTION OF some_range FROM t1 TO t2
)
SELECT * FROM x
FOR PORTION OF some_range FROM t3 TO t4
;

But with PERIODs you can't because a PERIOD is not included in `SELECT
*`. Also it's unclear how "attached" it is to a table definition. Can
you say `SELECT *, some_period FROM x`? If so can you then use `FOR
PORTION OF` on that result set? Can you construct an on-the-fly PERIOD
expression? Can you pass a period to a function as a parameter? Can a
function return a period? Can you ORDER BY a period? GROUP BY one? Can
you cast to/from a period? Can you ask a period for its high/low
values? Do we treat a PERIOD as a whole new datatype? Can you define a
real column of type PERIOD? I haven't found text from the standard
that answers most of these questions. (The standard does say you can
construct a `PERIOD(t1, t2)` expression but apparently only inside a
"period predicate".)

Also you can't define PERIODs on types other than
date/timestamp/timestamptz, unlike ranges.

Also PERIODs require a sentinel value to mean "unbounded" (e.g.
31-JAN-9999) whereas ranges let you express that with a NULL.
(Postgres does have Infinity and -Infinity for timestamp types but
I've noticed that client programming languages can't always express
ranges with those values.)

Personally I intend to use ranges any time I build temporal tables,
but supporting PERIODs might have value for people more interested in
database portability or someone migrating from elsewhere to Postgres.

I had some conversations at PGCon that I felt validated the
permit-PERIODS-or-ranges approach, so I'm about ready to expand my
patch to handle PERIODs too. For that I would love to draw on Vik's
work so far. I think his original patch against core is more likely to
be helpful than the extension, but I'll certainly consult both, and
Vik if you have any advice let me know! :-)

A big difference between a temporal extension vs temporal features in
core is implementing DML. An extension pretty much requires you to use
INSTEAD OF triggers. Also Vik's README points out that implementing
temporal DELETE is hard that way. In core I believe you'd do temporal
DML in the executor node. (That's my working theory anyway; I'm still
new to that part of the code.)

The first thing on my TODO list is to write a blog post comparing how
other RDMBSes handle PERIODs and other temporal features. Besides the
questions above, how does a trigger work on a table? For example when
you DELETE in the middle of a range/period, and it becomes an INSERT
plus an UPDATE, I *believe* you still fire the DELETE trigger. And you
need to set the NEW/OLD tuples appropriately. You *don't* fire any
INSERT & UPDATE triggers. The standard isn't super explicit but that's
my take on it, and I want to write down what other vendors are doing.

Yours,
Paul

In response to

  • Re: Periods at 2019-07-04 18:44:33 from Alvaro Herrera

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-07-05 19:30:35 Re: Use relative rpath if possible
Previous Message Bruce Momjian 2019-07-05 18:52:27 Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.