Re: Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-09-23 16:50:19
Message-ID: CAFj8pRAej_Jpw3S5nnJYQZAwCa8MYMY1-3q0-n64eeqFFJykRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

po 23. 9. 2019 v 9:57 odesílatel Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> napsal:

>
>
> On 20.09.2019 19:43, Pavel Stehule wrote:
>
>
> 1. I do not need them at all.
>> 2. Eliminate catalog bloating.
>> 3. Mostly needed for compatibility with Oracle (simplify porting,...).
>> 4. Parallel query execution.
>> 5. Can be used at replica.
>> 6. More efficient use of resources (first of all memory).
>>
>
> There can be other point important for cloud. Inside some cloud usually
> there are two types of discs - persistent (slow) and ephemeral (fast). We
> effectively used temp tables there because we moved temp tablespace to
> ephemeral discs.
>
>
> Yes, I already heard this argument and agree with it.
> I just want to notice two things:
> 1. My assumption is that in most cases data of temporary table can fit in
> memory (certainly if we are not limiting them by temp_buffers = 8MB, but
> store in shared buffers) and so there is on need to write them to the
> persistent media at all.
> 2. Global temp tables do not substitute local temp tables, accessed
> through local buffers. So if you want to use temporary storage, you will
> always have a way to do it.
> The question is whether we need to support two kinds of global temp tables
> (with shared or private buffers) or just implement one of them.
>

It's valid only for OLTP. OLAP world is totally different. More if all
users used temporary tables, and you should to calculate with it - it is
one reason for global temp tables, then you need multiply size by
max_connection.

hard to say what is best from implementation perspective, but it can be
unhappy if global temporary tables has different performance
characteristics and configuration than local temporary tables.

>
>
> I missing one point in your list - developer's comfort - using temp tables
> is just much more comfortable - you don't need create it again, again, ..
> Due this behave is possible to reduce @2 and @3 can be nice side effect. If
> you reduce @2 to zero, then @5 should be possible without any other.
>
> Sorry, I do not completely understand your point here
> You can use normal (permanent) table and you will not have to create them
> again and again. It is also possible to use them for storing temporary data
> - just need to truncate table when data is not needed any more.
> Certainly you can not use the same table in more than one backend. Here is
> the main advantage of temp tables - you can have storage of per-session
> data and do not worry about possible name conflicts.
>

You use temporary tables because you know so you share data between session
never. I don't remember any situation when I designed temp tables with
different schema for different sessions.

Using global temp table is not effective - you are work with large tables,
you need to use delete, .. so you cannot to use classic table like temp
tables effectively.

> From the other side: there are many cases where format of temporary data
> is not statically known: it is determined dynamically during program
> execution.
> In this case local temp table provides the most convenient mechanism for
> working with such data.
>
> This is why I think that ewe need to have both local and global temp
> tables.
>
> Also I do not agree with your statement "If you reduce @2 to zero, then @5
> should be possible without any other".
> In the solution implemented by Aleksander Alekseev metadata of temporary
> tables is kept in memory and not affecting catalog at all.
> But them still can not be used at replica.
> There are still some serious problems which need to be fixed to able it:
> allow insert/update/delete statements for read-only transactions, somehow
> assign XIDs for them, implement savepoints and rollback of such
> transactions.
> All this was done in the last version of my patch.
> Yes, it doesn't depend on whether we are using shared or private buffers
> for temporary tables. The same approach can be implemented for both of them.
> The question is whether we are really need temp tables at replica and if
> so, do we need full transaction support for them, including rollbacks,
> subtransactions.
>

temporary tables (of any type) on replica is interesting feature that opens
some possibilities. Some queries cannot be optimized and should be divided
and some results should be stored to temporary tables, analysed (to get
correct statistics), maybe indexed, and after that the calculation can
continue. Now you can do this just only on master. More - on HotStandBy the
data are read only, and without direct impact on master (production), so
you can do some harder calculation there. And temporary tables is used
technique how to fix estimation errors.

I don't think so subtransaction, transaction, rollbacks are necessary for
these tables. On second hand with out it, it is half cooked features, and
can looks pretty strange in pg environment.

I am very happy, how much work you do in this area, I had not a courage to
start this job, but I don't think so this work can be reduced just to some
supported scenarios - and I hope so correct implementation is possible -
although it is not simply work.

> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Steven Pousty 2019-09-23 16:52:24 Re: JSONPATH documentation
Previous Message Alvaro Herrera 2019-09-23 16:45:29 Re: [proposal] de-TOAST'ing using a iterator