Re: [Proposal] Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-01-27 19:44:13
Message-ID: CAFj8pRDVaNQhg1juJKuj3A2fN-TW5LsMVCqyOkEhKKjbjHEk5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

po 27. 1. 2020 v 10:11 odesílatel Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> napsal:

>
>
> On 24.01.2020 22:39, Pavel Stehule wrote:
>
> I cannot to evaluate your proposal, and I am sure, so you know more about
> this code.
>
> There is a question if we can allow to build local temp index on global
> temp table. It is different situation. When I work with global properties
> personally I prefer total asynchronous implementation of any DDL operations
> for other than current session. When it is true, then I have not any
> objection. For me, good enough design of any DDL can be based on catalog
> change without forcing to living tables.
>
>
> From my point of view there are two difference uses cases of temp tables:
> 1. Backend needs some private data source which is specific to this
> session and has no relation with activities of other sessions.
> 2. We need a table containing private session data, but which is used in
> the same way by all database users.
>
> In the first case current Postgres temp tables works well (if we forget
> for a moment about all known issues related with temp tables).
> Global temp tables are used to address the second scenario. Assume that
> we write some stored procedure or implement some business logic outside
> database and
> what to perform some complex analtic query which requires tepmp table for
> storing intermediate results. In this case we can create GTT with all
> needed index at the moment of database initialization
> and do not perform any extra DDL during query execution. If will prevent
> catalog bloating and makes execution of query more efficient.
>
> I do not see any reasons to allow build local indexes for global table.
> Yes,it can happen that some session will have small amount of data in
> particular GTT and another - small amount of data in this table. But if
> access pattern is the same (and nature of GTT assumes it), then index in
> either appreciate, either useless in both cases.
>
>
>
>
> I see following disadvantage of your proposal. See scenario
>
> 1. I have two sessions
>
> A - small GTT with active owner
> B - big GTT with some active application.
>
> session A will do new index - it is fast, but if creating index is forced
> on B on demand (when B was touched), then this operation have to wait after
> index will be created.
>
> So I afraid build a index on other sessions on GTT when GTT tables in
> other sessions will not be empty.
>
>
>
> Yes, it is true. But is is not the most realistic scenario from my point
> of view.
> As I explained above, GTT should be used when we need temporary storage
> accessed in the same way by all clients.
> If (as with normal tables) at some moment of time DBA realizes, that
> efficient execution of some queries needs extra indexes,
> then it should be able to do it. It is very inconvenient and unnatural to
> prohibit DBA to do it until all sessions using this GTT are closed (it may
> never happen)
> or require all sessions to restart to be able to use this index.
>
> So it is possible to imagine different scenarios of working with GTTs.
> But from my point of view the only non-contradictory model of their
> behavior is to make it compatible with normal tables.
> And do not forget about compatibility with Oracle. Simplifying of porting
> existed applications from Oracle to Postgres may be the
> main motivation of adding GTT to Postgres. And making them incompatible
> with Oracle will be very strange.
>

I don't think so compatibility with Oracle is valid point in this case. We
need GTT, but the mechanism of index building should be designed for
Postgres, and for users.

Maybe the method proposed by you can be activated by some option like
CREATE INDEX IMMEDIATELY FOR ALL SESSION. When you use GTT without index,
then
it should to work some time more, and if you use short life sessions, then
index build can be last or almost last operation over table and can be
suboptimal.

Anyway, this behave can be changed later without bigger complications - and
now I am have strong opinion to prefer don't allow to any DDL (with index
creation) on any active GTT in other sessions.
Probably your proposal - build indexes on other sessions when GTT is
touched can share code with just modify metadata and wait on session reset
or GTT reset

Usually it is not hard problem to refresh sessions, and what I know when
you update plpgsql code, it is best practice to refresh session early.

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2020-01-27 20:05:44 Re: making the backend's json parser work in frontend code
Previous Message Stephen Frost 2020-01-27 19:26:42 Re: our checks for read-only queries are not great