Re: AS OF queries

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: AS OF queries
Date: 2018-01-10 09:24:02
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02.01.2018 21:12, Peter Eisentraut wrote:
> On 12/29/17 06:28, Konstantin Knizhnik wrote:
>>> Can there be apparent RI
>>> violations?
>> Right now AS OF is used only in selects, not in update statements. So I
>> do not understand how integrity constraints can be violated.
> I mean, if you join tables connected by a foreign key, you can expect a
> certain shape of result, for example at least one match per PK row. But
> if you select from each table "as of" a different timestamp, then that
> won't hold. That could also throw off any optimizations we might come
> up with in that area, such as cross-table statistics. Not saying it
> can't or shouldn't be done, but there might be some questions.

Now I understand your statement. Yes, combining different timelines in
the same query can violate integrity constraint.
In theory there can be some query plans which will be executed
incorrectly  because of this constraint violation.
I do not know concrete examples of such plans right now, but I can not
prove that such problem can  not happen.

>>> What happens if no old data for the
>>> selected AS OF is available?
>> It will just return the version closest to the specified timestamp.
> That seems strange. Shouldn't that be an error?

I will add an option raising error in this case.
I do not want to always throw error, because Postgres is very
conservative in reclaiming old space. And the fact that version is not
used by any snapshot doesn't mean that it will be immediately deleted.
So there is still chance to peek-up old data although it is out of the
specified time travel period.

>>> How does this interact with catalog
>>> changes, such as changes to row-level security settings? (Do we apply
>>> the current or the past settings?)
>> Catalog changes are not currently supported.
>> And I do not have good understanding how to support it if query involves
>> two different timeslice with different versions of the table.
>> Too much places in parser/optimizer have to be change to support such
>> "historical collisions".
> Right, it's probably very hard to do. But I think it somehow should be
> recognized that catalog changes took place between the selected
> timestamp(s) and now and an error or notice should be produced.
There is one challenge: right now AS OF timestamps are not required to
be constants: them can be calculated dynamically during query execution.
So at the time of query compilation it is not possible to check whether
specified timestamps observe catalog changes or not.

Konstantin Knizhnik
Postgres Professional:
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2018-01-10 09:34:32 Re: AS OF queries
Previous Message Fabien COELHO 2018-01-10 08:54:13 Re: [HACKERS] pgbench - allow to store select results into variables