Re: WIP: System Versioned Temporal Table

From: Hannu Krosing <hannuk(at)google(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Rémi Lapeyre <remi(dot)lapeyre(at)lenstra(dot)fr>, Ryan Lambert <ryan(at)rustprooflabs(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Eli Marmor <eli(at)netmask(dot)it>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Li Japin <japinli(at)hotmail(dot)com>
Subject: Re: WIP: System Versioned Temporal Table
Date: 2021-09-20 09:49:25
Message-ID: CAMT0RQR4XHFyced+FzEaRdwOc3hpBRPHC+26WK3-OvZaz1vVHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 20, 2021 at 7:09 AM Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>
> On Sun, Sep 19, 2021 at 3:12 PM Hannu Krosing <hannuk(at)google(dot)com> wrote:
>>
>> A side table has the nice additional benefit that we can very easily
>> version the *table structure* so when we ALTER TABLE and the table
>> structure changes we just make a new side table with now-currents
>> structure.
>
>
> It's true that would allow for perfect capture of changes to the table structure, but how would you query the thing?
>
> If a system versioned table was created with a column foo that is type float, and then we dropped that column, how would we ever query what the value of foo was in the past?

We can query that thing only in tables AS OF the time when the column
was still there.

We probably could get away with pretending the dropped columns to be
NULL in newer versions (and the versions before the column was added)

Even more tricky case would be changing the column data type.

>
> Would the columns returned from SELECT * change based on the timeframe requested?

If we want to emulate real table history, then it should.

But the * thing was not really specified well even for original
PostgreSQL inheritance.

Maybe we could do SELECT (* AS OF 'yesterday afternoon'::timestamp) FROM ... :)

> If we then later added another column that happened to also be named foo but now was type JSONB, would we change the datatype returned based on the time period being queried?

Many databases do allow returning multiple result sets, and actually
the PostgreSQL wire *protocol* also theoretically supports this, just
that it is not supported by any current client library.

With current libraries it would be possible to return a dynamic
version of row_to_json(t.*) which changes based on actual historical
table structure

> Is the change in structure a system versioning which itself must be captured?

We do capture it (kind of) for logical decoding. That is, we decode
according to the structure in effect at the time of row creation,
though we currently miss the actual DDL itself.

So there is a lot to figure out and define, but at least storing the
history in a separate table gives a good foundation to build upon.

-----
Hannu Krosing
Google Cloud - We have a long list of planned contributions and we are hiring.
Contact me if interested.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2021-09-20 09:53:57 Re: [PATCH] Full support for index LP_DEAD hint bits on standby
Previous Message Ajin Cherian 2021-09-20 09:47:43 Re: row filtering for logical replication