Re: [Proposal] Global temporary tables

From: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, Cai, Le <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2019-11-08 07:50:42
Message-ID: F687C0B8-AE94-4A65-9AD9-BB83B874F425@alibaba-inc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2019年11月8日 上午12:32,Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> 写道:
>
>
>
> On 07.11.2019 12:30, 曾文旌(义从) wrote:
>>
>>> May be the assumption is that all indexes has to be created before GTT start to be used.
>> Yes, Currently, GTT's index is only supported and created in an empty table state, and other sessions are not using it.
>> There has two improvements pointer:
>> 1 Index can create on GTT(A) when the GTT(A) in the current session is not empty, requiring the GTT table to be empty in the other session.
>> Index_build needs to be done in the current session just like a normal table. This improvement is relatively easy.
>>
>> 2 Index can create on GTT(A) when more than one session are using this GTT(A).
>> Because when I'm done creating an index of the GTT in this session and setting it to be an valid index, it's not true for the GTT in other sessions.
>> Indexes on gtt in other sessions require "rebuild_index" before using it.
>> I don't have a better solution right now, maybe you have some suggestions.
> It is possible to create index on demand:
>
> Buffer
> _bt_getbuf(Relation rel, BlockNumber blkno, int access)
> {
> Buffer buf;
>
> if (blkno != P_NEW)
> {
> /* Read an existing block of the relation */
> buf = ReadBuffer(rel, blkno);
> /* Session temporary relation may be not yet initialized for this backend. */
> if (blkno == BTREE_METAPAGE && GlobalTempRelationPageIsNotInitialized(rel, BufferGetPage(buf)))
> {
> Relation heap = RelationIdGetRelation(rel->rd_index->indrelid);
> ReleaseBuffer(buf);
> DropRelFileNodeLocalBuffers(rel->rd_node, MAIN_FORKNUM, blkno);
> btbuild(heap, rel, BuildIndexInfo(rel));
> RelationClose(heap);
> buf = ReadBuffer(rel, blkno);
> LockBuffer(buf, access);
> }
> else
> {
> LockBuffer(buf, access);
> _bt_checkpage(rel, buf);
> }
> }
> ...
In my opinion, it is not a good idea to trigger a btbuild with a select or DML, the cost of which depends on the amount of data in the GTT.

>
>
> This code initializes B-Tree and load data in it when GTT index is access and is not initialized yet.
> It looks a little bit hacker but it works.
>
> I also wonder why you are keeping information about GTT in shared memory. Looks like the only information we really need to share is table's metadata.
> But it is already shared though catalog. All other GTT related information is private to backend so I do not see reasons to place it in shared memory.
The shared hash structure tracks which backend has initialized the GTT storage in order to implement the DDL of the GTT.
As for GTT, there is only one definition(include index on GTT), but each backend may have one data.
For the implementation of drop GTT, I assume that all data and definitions need to be deleted.

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>

Attachment Content-Type Size
global_temporary_table_v2-pg13.patch application/octet-stream 136.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-11-08 07:54:11 Re: Refactor parse analysis of EXECUTE command
Previous Message Pavel Stehule 2019-11-08 07:38:57 Re: Refactor parse analysis of EXECUTE command