Re: AS OF queries

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: AS OF queries
Date: 2017-12-29 11:28:34
Message-ID: 5ba86346-40ff-e546-8e40-80ee7c1b4b78@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28.12.2017 20:28, Peter Eisentraut wrote:
> On 12/28/17 11:36, Konstantin Knizhnik wrote:
>> Attached please find new version of AS OF patch which allows to specify
>> time travel period.
>> Older versions outside this period may be reclaimed by autovacuum.
>> This behavior is controlled by "time_travel_period" parameter.
> So where are we on using quasi SQL-standard syntax for a nonstandard
> interpretation? I think we could very well have a variety of standard
> and nonstandard AS OF variants, including by commit timestamp, xid,
> explicit range columns, etc. But I'd like to see a discussion on that,
> perhaps in a documentation update, which this patch is missing.
SQL:2011 ||defines rules for creation and querying of temporal tables.
I have not read this standard myself, I just take information about it
from wikipedia:
https://en.wikipedia.org/wiki/SQL:2011
According to this standard time-sliced queries are specified using
|
AS OF SYSTEM TIME| |and| |VERSIONS BETWEEN SYSTEM TIME ... AND ...|clauses.

Looks like it is supported now only by Oracle. IBM DB, MS-SQL, are
providing similar functionality in slightly different way.
I am not sure whether strict support of SQL:2011 standard is critical
and which other functionality we need.

> I have questions about corner cases. What happens when multiple tables
> are queried with different AS OF clauses?
It is possible.

> Can there be apparent RI
> violations?
Right now AS OF is used only in selects, not in update statements. So I
do not understand how integrity constraints can be violated.

> What happens when the time_travel_period is changed during
> a session?
Right now it depends on autovacuum: how fast it will be able to reclaim
old version.
Actually I I do not see much sense in changing time travel period during
session.
In asof-4.patch time_travel_period is postmaster level GUC which can not
be changed in session.
But I have changed policy for it for SIGHUP to make experiments with it
more easier.

> How can we check how much old data is available, and how can
> we check how much space it uses?
Physical space used by the database/relation can be determined using
standard functions, for example pg_total_relation_size.
I do not know any simple way to get total number of all stored versions.
> What happens if no old data for the
> selected AS OF is available?
It will just return the version closest to the specified timestamp.

> How does this interact with catalog
> changes, such as changes to row-level security settings? (Do we apply
> the current or the past settings?)
Catalog changes are not currently supported.
And I do not have good understanding how to support it if query involves
two different timeslice with different versions of the table.
Too much places in parser/optimizer have to be change to support such
"historical collisions".

> This patch should probably include a bunch of tests to cover these and
> other scenarios.
Right now I have added just one test: asof.sql.
It requires "track_commit_timestamp" option to be switched on and it is
postmaster level GUC.
So I have added for it postgresql.asof.conf and asof_schedule.
This test should be launched using the following command:

make check EXTRA_REGRESS_OPTS="--schedule=asof_schedule
--temp-config=postgresql.asof.config"

If there is some better way to include this test in standard regression
tests, please let me know.
> (Maybe "period" isn't the best name, because it implies a start and an
> end. How about something with "age"?)
Well I am not an English native speaker. So I can not conclude what is
more natural.
"period" is widely used in topics related with temporal tables (just
count occurrences of this word at https://en.wikipedia.org/wiki/SQL:2011)
Age is not used here at all.
From my point of view age is something applicable to person, building,
monument,...
It is not possible to say about "ge of time travel". In science fiction
"time machines" frequently have limitations: you can not got more than N
years in the past.
How we can name this N? Is it "period", "age" or something else?

I attached yet another version of the patch which includes test for AS
OF query.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
asof-5.patch text/x-patch 41.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-12-29 11:49:15 Re: Basebackups reported as idle
Previous Message Yugo Nagata 2017-12-29 10:17:05 Fix a Oracle-compatible instr function in the documentation