Re: Disable vacuuming to provide data history

From: Hannu Krosing <hannuk(at)google(dot)com>
To: Vik Fearing <vik(at)postgresfriends(dot)org>
Cc: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, marekmosiewicz(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Disable vacuuming to provide data history
Date: 2023-03-26 15:19:18
Message-ID: CAMT0RQSm7o511W1Ga3Z3YFKkeCgtVKrar0hVzacfJ2-ERXhXsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

There is also another blocker - our timestamp resolution is 1
microsecond and we are dangerously close to speeds where one could
update a row twice in the same microsecond .

I have been thinking about this, and what is needed is

1. a nanosecond-resolution "abstime" type - not absolutely necessary,
but would help with corner cases.
2. VACUUM should be able to "freeze" by replacing xmin/xmax values
with commit timestamps, or adding tmin/tmax where necessary.
3. Optionally VACUUM could move historic rows to archive tables with
explicit tmin/tmax columns (this also solves the pg_dump problem)

Most of the above design - apart from the timestamp resolution and
vacuum being the one doing stamping in commit timestamps - is not
really new - up to version 6.2 PostgreSQL had tmin/tmax instead of
xmin/xmax and you could specify the timestamp you want to query any
table at.

And the original Postgres design was Full History Database where you
could say " SELECT name, population FROM cities['epoch' .. 'now'] " to
get all historic population values.

And historic data was meant to be moved to the WORM optical drives
which had just arrived to the market

---
Hannu

On Sat, Feb 25, 2023 at 3:11 AM Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
>
> On 2/24/23 22:06, Corey Huinker wrote:
> > On Thu, Feb 23, 2023 at 6:04 AM <marekmosiewicz(at)gmail(dot)com> wrote:
> >
> > [1] some implementations don't use null, they use an end-timestamp set to
> > a date implausibly far in the future ( 3999-12-31 for example ),
>
> The specification is, "At any point in time, all rows that have their
> system-time period end column set to the highest value supported by the
> data type of that column are known as current system rows; all other
> rows are known as historical system rows."
>
> I would like to see us use 'infinity' for this.
>
> The main design blocker for me is how to handle dump/restore. The
> standard does not bother thinking about that.
> --
> Vik Fearing
>
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2023-03-26 17:42:42 Re: Schema variables - new implementation for Postgres 15
Previous Message Jeff Janes 2023-03-26 15:12:48 awkward cancellation of parallel queries on standby.