Re: [Proposal] Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, wenjing(at)gmail(dot)com, Andrew Bille <andrewbille(at)gmail(dot)com>, Tony Zhu <Tony(dot)zhu(at)ww-it(dot)cn>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Global temporary tables
Date: 2022-02-27 05:09:54
Message-ID: CAFj8pRANfAciHwDwNyH3-QkkNPFFDRuBfFamYGaXWZKsW4HCCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ne 27. 2. 2022 v 5:13 odesílatel Andres Freund <andres(at)anarazel(dot)de> napsal:

> Hi,
>
> On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > > You redirect stats from pg_class and pg_statistics to a local hash
> table.
> > > This is pretty hairy :(
>
> As is I think the patch is architecturally completely unacceptable. Having
> code everywhere to redirect to manually written in-memory catalog table
> code
> isn't maintainable.
>
>
> > > I guess you'd also need to handle pg_statistic_ext and ext_data.
> > > pg_stats doesn't work, since the data isn't in pg_statistic - it'd
> need to
> > > look
> > > at pg_get_gtt_statistics.
> >
> > Without this, the GTT will be terribly slow like current temporary tables
> > with a lot of problems with bloating of pg_class, pg_attribute and
> > pg_depend tables.
>
> I think it's not a great idea to solve multiple complicated problems at
> once...
>

I thought about this issue for a very long time, and I didn't find any
better (without more significant rewriting of pg storage). In a lot of
projects, that I know, the temporary tables are strictly prohibited due
possible devastating impact to system catalog bloat. It is a serious
problem. So any implementation of GTT should solve the questions: a) how to
reduce catalog bloating, b) how to allow session related statistics for
GTT. I agree so implementation of GTT like template based LTT (local
temporary tables) can be very simple (it is possible by extension), but
with the same unhappy performance impacts.

I don't say so current design should be accepted without any discussions
and without changes. Maybe GTT based on LTT can be better than nothing
(what we have now), and can be good enough for a lot of projects where the
load is not too high (and almost all projects have low load).
Unfortunately,it can be a trap for a lot of projects in future, so there
should be discussion and proposed solutions for fix of related issues. The
performance of GTT should be fixable, so any discussion about this topic
should have part about protections against catalog bloat and about cost
related to frequent catalog updates.

But anyway, I invite (and probably not just me) any discussion on how to
implement this feature, how to solve performance issues, and how to divide
implementation into smaller steps. I am sure so fast GTT implementation
can be used for fast implementation of LTT too, and maybe with all other
temporary objects

Regards

Pavel

> Greetings,
>
> Andres Freund
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-02-27 07:20:25 Add WAL recovery messages with log_wal_traffic GUC (was: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display)
Previous Message Andres Freund 2022-02-27 04:13:04 Re: [Proposal] Global temporary tables