Re: WIP: System Versioned Temporal Table

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Rémi Lapeyre <remi(dot)lapeyre(at)lenstra(dot)fr>, Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Eli Marmor <eli(at)netmask(dot)it>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Li Japin <japinli(at)hotmail(dot)com>
Subject: Re: WIP: System Versioned Temporal Table
Date: 2021-09-19 00:15:52
Message-ID: CADkLM=ca+9gYjE2bfPWu-rJFHQjGcwOkEep8d841-x4=-h8EZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
>
> 1. Much of what I have read about temporal tables seemed to imply or
> almost assume that system temporal tables would be implemented as two
> actual separate tables. Indeed, SQLServer appears to do it that way [1]
> with syntax like
>
> WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WebsiteUserInfoHistory));
>
>
> Q 1.1. Was that implementation considered and if so, what made this
> implementation more appealing?
>
>
I've been digging some more on this point, and I've reached the conclusion
that a separate history table is the better implementation. It would make
the act of removing system versioning into little more than a DROP TABLE,
plus adjusting the base table to reflect that it is no longer system
versioned.

What do you think of this method:

1. The regular table remains unchanged, but a pg_class attribute named
"relissystemversioned" would be set to true
2. I'm unsure if the standard allows dropping a column from a table while
it is system versioned, and the purpose behind system versioning makes me
believe the answer is a strong "no" and requiring DROP COLUMN to fail
on relissystemversioned = 't' seems pretty straightforward.
3. The history table would be given a default name of $FOO_history (space
permitting), but could be overridden with the history_table option.
4. The history table would have relkind = 'h'
5. The history table will only have rows that are not current, so it is
created empty.
6. As such, the table is effectively append-only, in a way that vacuum can
actually leverage, and likewise the fill factor of such a table should
never be less than 100.
7. The history table could only be updated only via system defined triggers
(insert,update,delete, alter to add columns), or row migration similar to
that found in partitioning. It seems like this would work as the two tables
working as partitions of the same table, but presently we can't have
multi-parent partitions.
8. The history table would be indexed the same as the base table, except
that all unique indexes would be made non-unique, and an index of pk +
start_time + end_time would be added
9. The primary key of the base table would remain the existing pk vals, and
would basically function normally, with triggers to carry forth changes to
the history table. The net effect of this is that the end_time value of all
rows in the main table would always be the chosen "current" value
(infinity, null, 9999-12-31, etc) and as such might not actually _need_ to
be stored.
10. Queries that omit the FOR SYSTEM_TIME clause, as well as ones that use
FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, would simply use the base table
directly with no quals to add.
11. Queries that use FOR SYSTEM_TIME and not FOR SYSTEM_TIME AS
OF CURRENT_TIMESTAMP, then the query would do a union of the base table and
the history table with quals applied to both.
12. It's a fair question whether the history table would be something that
could be queried directly. I'm inclined to say no, because that allows for
things like SELECT FOR UPDATE, which of course we'd have to reject.
13. If a history table is directly referenceable, then SELECT permission
can be granted or revoked as normal, but all insert/update/delete/truncate
options would raise an error.
14. DROP SYSTEM VERSIONING from a table would be quite straightforward -
the history table would be dropped along with the triggers that reference
it, setting relissystemversioned = 'f' on the base table.

I think this would have some key advantages:

1. MVCC bloat is no worse than it was before.
2. No changes whatsoever to referential integrity.
3. DROP SYSTEM VERSIONING becomes an O(1) operation.

Thoughts?

I'm going to be making a similar proposal to the people doing the
application time effort, but I'm very much hoping that we can reach some
consensus and combine efforts.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2021-09-19 00:46:16 Re: SQL:2011 application time
Previous Message Dean Rasheed 2021-09-18 21:44:42 Re: So, about that cast-to-typmod-minus-one business