Re: [Proposal] Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2019-11-11 15:19:21
Message-ID: 6b7570aa-14ea-2f28-c3b5-74dd785bbafe@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08.11.2019 18:06, 曾文旌(义从) wrote:
> My comments for global_private_temp-4.patch

Thank you very much for inspecting my patch.
>
> good side:
> 1 Lots of  index type on GTT. I think we need support for all kinds of
> indexes.
> 2 serial column on GTT.
> 3 INHERITS GTT.
> 4 PARTITION GTT.
>
> I didn't choose to support them in the first release, but you did.
>
> Other side:
> 1 case: create global temp table gtt2(a int primary key, b text) on
> commit delete rows;
> I think you've lost the meaning of the on commit delete rows clause.
> After the GTT is created, the other sessions feel that this is an on
> commit PRESERVE rows GTT.
>

Yes, there was bug in my implementation of ON COMMIT DELETE ROWS for GTT.
It is fixed in global_private_temp-6.patch

> 2 truncate gtt, mybe this is a bug in DropRelFileNodeBuffers.
> GTT's local buffer is not released.
> Case:
> postgres=# insert into gtt2 values(1,'xx');
> INSERT 0 1
> postgres=# truncate gtt2;
> TRUNCATE TABLE
> postgres=# insert into gtt2 values(1,'xx');
> ERROR:  unexpected data beyond EOF in block 0 of relation
> base/13579/t3_16384
> HINT:  This has been seen to occur with buggy kernels; consider
> updating your system.
>

Yes another bug, also fixed in new version of the patch.

> 3  lock type of truncate GTT.
> I don't think it's a good idea to hold a big lock with truncate GTT,
> because it only needs to process private data.

Sorry, I do not understand which lock you are talking about.
I have not introduced any special locks for GTT.

> 4 GTT's ddl Those ddl that need to rewrite data files may need attention.
> We have discussed in the previous email. This is why I used shared
> hash to track the GTT file.
>

You are right.
But instead of prohibiting ALTER TABLE at all for GTT, we can check
that there are no other backends using it.
I do not think that we should maintain some hash in shared memory to
check it.
As far as ALTER TABLE is rare and slow operation in any case, we can
just check presence of GTT files
created by other backends.
I have implemented this check in global_private_temp-6.patch

>
> 5 There will be problems with DDL that will change relfilenode. Such
> as cluster GTT ,vacuum full GTT.
> A session completes vacuum full gtt(a), and other sessions will
> immediately start reading and writing new storage files and existing
> data is also lost.
> I disable them in my current version.

Thank you for noticing it.
Autovacuum full should really be prohibited for GTT.

>
> 6 drop GTT
> I think drop GTT should clean up all storage files and definitions.
> How do you think?
>
Storage files will be cleaned in any case on backend termination.
Certainly if backend creates  and deletes huge number of GTT in the
loop, it can cause space exhaustion.
But it seems to be very strange pattern of GTT usage.

> 7 MVCC visibility clog clean
> GTT data visibility rules, like regular tables, so GTT also need clog.
> We need to avoid the clog that GTT needs to be cleaned up.
> At the same time, GTT does not do autovacuum, and retaining "too old
> data" will cause wraparound data loss.
> I have given a solution in my design.
>
But why do we need some special handling of visibility rules for GTT
comparing with normal (local) temp tables?
Them are also not proceeded by autovacuum?

In principle, I have also implemented special visibility rules for GTT,
but only for the case when them
are accessed at replica. And it is not included in this patch, because
everybody think that access to GTT
replica should be considered in separate patch.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2019-11-11 15:25:56 Re: Attempt to consolidate reading of XLOG page
Previous Message Alvaro Herrera 2019-11-11 14:56:49 Re: FETCH FIRST clause WITH TIES option