Re: [Proposal] Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-01-16 07:23:33
Message-ID: 73e8f0f1-62b5-0a97-5802-a98ac9d07a33@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.01.2020 16:10, 曾文旌(义从) wrote:
>
>> I do not see principle difference here with scenario when 50 sessions create (local) temp table,
>> populate it with GB of data and create index for it.
> I think the problem is that when one session completes the creation of the index on GTT,
> it will trigger the other sessions build own local index of GTT in a centralized time.
> This will consume a lot of hardware resources (cpu io memory) in a short time,
> and even the database service becomes slow, because 50 sessions are building index.
> I think this is not what we expected.

First of all creating index for GTT ni one session doesn't immediately
initiate building indexes in all other sessions.
Indexes are built on demand. If session is not using this GTT any more,
then index for it will not build at all.
And if GTT is really are actively used by all sessions, then building
index and using it for constructing optimal execution plan is better,
then continue to  use sequential scan and read all GTT data from the disk.

And as I already mentioned I do not see some principle difference in
aspect of resource consumptions comparing with current usage of local
temp tables.
If we have have many sessions, each creating temp table, populating it
with data and building index for it, then we will
observe the same CPU utilization and memory resource consumption as in
case of using GTT and creating index for it.

Sorry, but I still not convinced by your and Tomas arguments.
Yes, building GTT index may cause high memory consumption
(maintenance_work_mem * n_backends).
But such consumption can be  observed also without GTT and it has to be
taken in account when choosing value for maintenance_work_mem.
But from my point of view it is much more important to make behavior of
GTT as much compatible with normal tables as possible.
Also from database administration point of view, necessity to restart
sessions to make then use new indexes seems to be very strange and
inconvenient.
Alternatively DBA can address the problem with high memory consumption
by adjusting maintenance_work_mem, so this solution is more flexible.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-01-16 07:27:27 Re: Expose lock group leader pid in pg_stat_activity
Previous Message Kyotaro Horiguchi 2020-01-16 06:50:02 Re: Append with naive multiplexing of FDWs