Re: SQL:2011 application time

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2021-09-13 06:05:41
Message-ID: CADkLM=d-Qaq0wJNjV5K1jN9+CSNY1B9uXodAnM8dxwaDFxVCfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So I've been eagerly watching this thread and hoping to have time to devote
to it. I've also been looking at the thread at
https://www.postgresql.org/message-id/CALAY4q8Pp699qv-pJZc4toS-e2NzRJKrvaX-xqG1aqj2Q+Ww-w@mail.gmail.com
that covers system versioning, and per our conversation far too long ago
(again, my bad) it's obvious that the two efforts shouldn't do anything
that would be in conflict with one another, as we eventually have to
support bitemporal [1] tables: tables that have both system versioning and
an application period.

Below is a list of observations and questions about this proposed patch of
itself in isolation, but mostly about how it relates to the work being done
for system versioning.

1. This patch creates a pg_period catalog table, whereas the system
versioning relies on additions to pg_attribute to identify the start/end
columns. Initially I thought this was because it was somehow possible to
have *multiple* application periods defined on a table, but in reading [1]
I see that there are some design suppositions that would make a second
application period impossible[2]. I can also see where having this table
would facilitate the easy creation of INFORMATION_SCHEMA.PERIODS. I was
previously unaware that this info schema table was a thing, but I have
found references to it, though I'm unclear as to whether it's supposed to
have information about system versioned tables in it as well.

Q 1.1. Would a bitemporal table have two entries in that view?
Q 1.2. Could you see being able to implement this without pg_period, using
only additions to pg_attribute (start/end for system temporal, start/end
for application, plus an addition for period name)?
Q 1.3. Can you see a way to represent the system versioning in pg_period
such that bitemporal tables were possible?

2. The system versioning effort has chosen 'infinity' as their end-time
value, whereas you have chosen NULL as that makes sense for an unbounded
range. Other databases seem to leverage '9999-12-31 23:59:59' (SQLServer,
IIRC) whereas some others seem to used '2999-12-31 23:59:59' but those
might have been home-rolled temporal implementations. To further add to the
confusion, the syntax seems to specify the keyword of MAXVALUE, which
further muddies things. The system versioning people went with 'infinity'
seemingly because it prescribe and end to the world like SQLServer did, but
also because it allowed for a primary key based on (id, endtime) and that's
just not possible with NULL endtime values.

Q 2.1. Do you have any thoughts about how to resolve this notational logjam?

3. I noticed some inconsistency in the results from various "SELECT * FROM
portion_of_test" examples. In some, the "valid_at" range is shown but not
columns that make it up, and in some others, the "valid_from" and
"valid_to" columns are shown, with no mention of the period. From what I've
seen, the period column should be invisible unless invoked, like ctid or
xmin.

4. The syntax '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE simply
confounded me. I googled around for it, but could find no matches for
postgres exception in mailing list discussions circa 2003. I tried it out
myself and, lo and behold

# SELECT '2018-03-04' AT TIME ZONE INTERVAL '2' HOUR TO MINUTE;
timezone
---------------------
2018-03-04 05:02:00
(1 row)

I really didn't expect that to work, or even "work". I can see that it
added 2 minutes to UTC's perspective on my local concept of midnight, but I
don't understand what it's supposed to mean.

Q 4.1. What does it mean?

5. I haven't seen any actual syntax conflicts between this patch and the
system versioning patch. Both teams added basically the same keywords,
though I haven't dove more deeply into any bison incompatibilities. Still,
it's a great start.

6. Overall, I'm really excited about what this will mean for data
governance in postgres.

[1]
https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf
[2] In the bitemporal table example in [1] - the application period get the
defined primary key, and the system_time period would be merely unique

On Mon, Sep 13, 2021 at 12:12 AM Paul A Jungwirth <
pj(at)illuminatedcomputing(dot)com> wrote:

> On Fri, Sep 10, 2021 at 6:50 PM Jaime Casanova
> <jcasanov(at)systemguards(dot)com(dot)ec> wrote:
> >
> > patch 01: does apply but gives a compile warning (which is fixed by patch
> > 02)
> > [snip]
> > patch 03: produces these compile errors.
>
> I did a rebase and fixed this new error, as well as the warnings.
>
> On Mon, Sep 6, 2021 at 1:40 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> >
> > + * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
> >
> > It seems the year (2018) should be updated to 2021.
>
> Done.
>
> > For RemovePeriodById(), it seems table_open() can be called after
> SearchSysCache1(). This way, if HeapTupleIsValid(tup) is true, table_open()
> can be skipped.
>
> This seems like it permits a race condition when two connections both
> try to drop the period, right?
>
> > For tablecmds.c, AT_PASS_ADD_PERIOD is defined as 5 with
> AT_PASS_ADD_CONSTR etc moved upward. Do we need to consider compatibility ?
>
> I don't think there is a compatibility problem---can you explain?
> These symbols aren't used outside tablecmds.c and the values aren't
> saved anywhere AFAIK.
>
> > There are a few TODO's such as:
> > Are they going to be addressed in the next round of patches ?
>
> These are mostly questions I'm hoping a reviewer can help me answer,
> but I'll take a pass through them and see which I can remove myself.
> Several are for adding support for partitioned tables, where I would
> definitely appreciate help.
>
> > There seems to be some overlap between ATExecAddPeriod() and
> AddRelationNewPeriod().
> > Is it possible to reduce code duplication ?
>
> I've refactored those functions to remove some duplication, but I
> think I prefer the old version---let me know if you have suggestions
> to avoid the duplication in a nicer way.
>
> Oh also I realized fp_triggers.c wasn't included in the last few patch
> files---I'm sorry about that!
>
> Latest files attached. Thanks for the reviews!
>
> Paul
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaime Casanova 2021-09-13 06:19:39 Re: brin multi minmax crash for inet value
Previous Message Fujii Masao 2021-09-13 05:56:11 Re: corruption of WAL page header is never reported