Re: SQL:2011 application time

From: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL:2011 application time
Date: 2021-11-21 01:51:16
Message-ID: CA+renyWqpNdsi9OaPmXheeQL72adi_iz36VgVsqyyGpQ20FDAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 16, 2021 at 3:55 PM Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
wrote:

> I haven't made any substantive changes, but I should have time soon to
> take a stab at supporting partitioned tables and removing some of my own
> TODOs (things like making sure I'm locking things correctly).
>

Hello,

Here are updated patches. They are rebased and clean up some of my TODOs.
Here is what remains:

- Various TODOs asking for advice about concurrency things: where to lock,
when to copy structs, etc. I'd appreciate some review on these from someone
more experienced than me.

- Supporting FOR PORTION OF against updateable views. I'll keep working on
this, but I thought there was enough progress to pass along new patches in
the meantime.

- Support partitioned tables. I think this is a medium-size effort, and I'm
not sure whether it's really needed for pg 15 or something we can add
later. I'm going to do my best to get it done though. (I should have more
time for this project now: having a sixth baby recently made side projects
challenging for a while, but lately things have been getting easier.)
Partitioning could use some design discussion though, both for application
time alone and for bitemporal tables (so overlapping with the system time
work). Here are some thoughts so far:

- Creating a PERIOD on a partitioned table should automatically create
the PERIOD (and associated constraints) on the child tables. This one seems
easy and I'll try to get it done soon.

- Sort of related, but not strictly partitioning: CREATE TABLE LIKE
should have a new INCLUDING PERIODS option. (I'm tempted to include this
under INCLUDING CONSTRAINTS, but I think a separate option is nicer since
it gives more control.)

- If you partition by something in the scalar part of the temporal PK,
that's easy. I don't think we have to do anything special there. I'd like
to add some tests about it though.

- We should allow temporal primary keys on the top-level partitioned
table, even though they are essentially exclusion constraints. Whereas in
the general case an exclusion constraint cannot prove its validity across
all the tables, a temporal PK *can* prove its validity so long the
partition key includes at least one scalar part of the temporal PK (so that
all records for one "entity" get routed to the same table).

- If you partition by the temporal part of the temporal PK, things are
harder. I'm inclined to forbid this, at least for v15. Suppose you
partition by the start time. Then you wind up with the same entity spread
across several tables, so you can't validate the overall exclusion
constraint anymore.

- OTOH you *could* partition by application-time itself (not start time
alone nor end time alone) where each partition has application-time
ranges/periods that are trimmed to fit within that partition's limits. Then
since each partition is responsible for a non-overlapping time period, you
could validate the overall exclusion constraint. You'd just have to add
some logic to tuple re-routing that could transform single records into
multiple records. For example if each partition holds a different year and
you INSERT a record that is valid for a decade, you'd have to insert one
row into ten partitions, and change the application-time range/period of
each row appropriately. This is a special kind of range partitioning. I
don't have any ideas how to make hash or list partitioning work on the
temporal part of the PK. I don't think we should allow it.

- Partitioning by application time requires no special syntax.
Partitioning by system time (if that's desired) would probably require
extra (non-standard) syntax. Mariadb has this:
https://mariadb.com/kb/en/system-versioned-tables/#storing-the-history-separately
Perhaps that is orthogonal to application-time partitioning though. It
sounds like people think we should store non-current system time in a
separate table (I agree), and in that case I think a bitemporal table that
is partitioned by scalar keys or application-time would just have a
separate system-time history table for each partition, and that would Just
Work. And if we *do* want to partition by system time too, then it would be
transparent to the application-time logic.

- Since system time doesn't add anything to your PK (or at least it
shouldn't), there is no extra complexity around dealing with exclusion
constraints. We should just guarantee that all *current* rows land in the
same partition, because for a bitemporal table that's the only one that
needs a temporal PK. I guess that means you could partition by end
system-time but not start system-time. This would be an exception to the
rule that a PK must include the partition keys. Instead we'd say that all
current (i.e. non-historical) records stay together (at the system-time
level of partitioning).

- I don't think system-time partitioning needs to be in v15. It seems
more complicated than ordinary partitioning.

Yours,
Paul

Attachment Content-Type Size
v10-0001-Add-PERIODs.patch application/octet-stream 117.4 KB
v10-0002-Add-temporal-PRIMARY-KEY-and-UNIQUE-constraints.patch application/octet-stream 83.4 KB
v10-0003-Add-UPDATE-DELETE-FOR-PORTION-OF.patch application/octet-stream 146.6 KB
v10-0004-Add-temporal-FOREIGN-KEYs.patch application/octet-stream 304.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Todd Hubers 2021-11-21 02:05:16 Re: Feature Proposal: Connection Pool Optimization - Change the Connection User
Previous Message Michael Paquier 2021-11-21 01:41:17 Re: VS2022: Support Visual Studio 2022 on Windows