Re: Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-09-23 07:57:08
Message-ID: 6d546a91-0faa-7335-4ceb-f0022855fc7d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20.09.2019 19:43, Pavel Stehule wrote:
>
> 1. I do not need them at all.
> 2. Eliminate catalog bloating.
> 3. Mostly needed for compatibility with Oracle (simplify porting,...).
> 4. Parallel query execution.
> 5. Can be used at replica.
> 6. More efficient use of resources (first of all memory).
>
>
> There can be other point important for cloud. Inside some cloud
> usually there are two types of discs - persistent (slow) and ephemeral
> (fast). We effectively used temp tables there because we moved temp
> tablespace to ephemeral discs.

Yes, I already heard this argument and agree with it.
I just want to notice two things:
1. My assumption is that in most cases data of temporary table can fit
in memory (certainly if we are not limiting them by temp_buffers = 8MB,
but store in shared buffers) and so there is on need to write them to
the persistent media at all.
2. Global temp tables do not substitute local temp tables, accessed
through local buffers. So if you want to use temporary storage, you will
always have a way to do it.
The question is whether we need to support two kinds of global temp
tables (with shared or private buffers) or just implement one of them.

>
> I missing one point in your list - developer's comfort - using temp
> tables is just much more comfortable - you don't need create it again,
> again, .. Due this behave is possible to reduce @2 and @3 can be nice
> side effect. If you reduce @2 to zero, then @5 should be possible
> without any other.
>
Sorry, I do not completely understand your point here
You can use normal (permanent) table and you will not have to create
them again and again. It is also possible to use them for storing
temporary data - just need to truncate table when data is not needed any
more.
Certainly you can not use the same table in more than one backend. Here
is the main advantage of temp tables - you can have storage of
per-session data and do not worry about possible name conflicts.

From the other side: there are many cases where format of temporary
data is not statically known: it is determined dynamically during
program execution.
In this case local temp table provides the most convenient mechanism for
working with such data.

This is why I think that ewe need to have both local and global temp tables.

Also I do not agree with your statement "If you reduce @2 to zero, then
@5 should be possible without any other".
In the solution implemented by Aleksander Alekseev metadata of temporary
tables is kept in memory and not affecting catalog at all.
But them still can not be used at replica.
There are still some serious problems which need to be fixed to able it:
allow insert/update/delete statements for read-only transactions,
somehow assign XIDs for them, implement savepoints and rollback of such
transactions.
All this was done in the last version of my patch.
Yes, it doesn't depend on whether we are using shared or private buffers
for temporary tables. The same approach can be implemented for both of them.
The question is whether we are really need temp tables at replica and if
so, do we need full transaction support for them, including rollbacks,
subtransactions.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tels 2019-09-23 08:28:09 Re: Efficient output for integer types
Previous Message Thunder 2019-09-23 07:48:50 Re:PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node