Re: [Proposal] Global temporary tables

From: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Erik Rijkers" <er(at)xs4all(dot)nl>, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Julien Rouhaud" <rjuju123(at)gmail(dot)com>, "Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, Cai, Le <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-02-04 15:01:37
Message-ID: 1C648A29-C88F-45B1-BE66-84B51D319CAD@alibaba-inc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2020年2月3日 下午4:16,Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> 写道:
>
>
>
> On 01.02.2020 19:14, 曾文旌(义从) wrote:
>>
>>
>>> 2020年1月27日 下午5:38,Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> 写道:
>>>
>>>
>>>
>>> On 25.01.2020 18:15, 曾文旌(义从) wrote:
>>>> I wonder why do we need some special check for GTT here.
>>>>> From my point of view cleanup at startup of local storage of temp tables should be performed in the same way for local and global temp tables.
>>>> After oom kill, In autovacuum, the Isolated local temp table will be cleaned like orphan temporary tables. The definition of local temp table is deleted with the storage file.
>>>> But GTT can not do that. So we have the this implementation in my patch.
>>>> If you have other solutions, please let me know.
>>>>
>>> I wonder if it is possible that autovacuum or some other Postgres process is killed by OOM and postmaster is not noticing it can doens't restart Postgres instance?
>>> as far as I know, crash of any process connected to Postgres shared memory (and autovacuum definitely has such connection) cause Postgres restart.
>> Postmaster will not restart after oom happen, but the startup process will. GTT data files are cleaned up in the startup process.
>
> Yes, exactly.
> But it is still not clear to me why do we need some special handling for GTT?
> Shared memory is reinitialized and storage of temporary tables is removed.
> It is true for both local and global temp tables.
Of course not. The local temp table cleans up the entire table (including catalog buffer and datafile). GTT is not.

>
>>>
>>>
>>>> In my design
>>>> 1 Because different sessions have different transaction information, I choose to store the transaction information of GTT in MyProc,not catalog.
>>>> 2 About the XID wraparound problem, the reason is the design of the temp table storage(local temp table and global temp table) that makes it can not to do vacuum by autovacuum.
>>>> It should be completely solve at the storage level.
>>>>
>>>
>>> My point of view is that vacuuming of temp tables is common problem for local and global temp tables.
>>> So it has to be addressed in the common way and so we should not try to fix this problem only for GTT.
>> I think I agree with you this point.
>> However, this does not mean that GTT transaction information stored in pg_class is correct.
>> If you keep it that way, like in global_private_temp-8.patch, It may cause data loss in GTT after aotuvauum.
>
> In my patch autovacuum is prohibited for GTT.
But vacuum GTT is not prohibited.

>
>> IMHO forced terminated of client sessions is not acceptable solution.
>>> And it is not an absolutely necessary requirement.
>>> So from my point of view we should not add such limitations to GTT design.
>> This limitation makes it possible for the GTT to do all the DDL.
>> IMHO even oracle's GTT has similar limitations.
>
> I have checked that Oracle is not preventing creation of index for GTT if there are some active sessions working with this table. And this index becomes visible for all this sessions.
1 Yes The creation of inde gtt has been improved in global_temporary_table_v10-pg13.patch
2 But alter GTT ; drop GTT ; drop index on GTT is blocked by other sessions

SQL> drop table gtt;
drop table gtt
*
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use

SQL> ALTER TABLE gtt add b int ;
ALTER TABLE gtt add b int
*
ERROR at line 1:
ORA-14450: attempt to access a transactional temp table already in use

SQL> drop index idx_gtt;
drop index idx_gtt
*
ERROR at line 1:
ORA-14452: attempt to create, alter or drop an index on temporary table already
in use

I'm not saying we should do this, but from an implementation perspective we face similar issues.
If a dba changes a GTT, he can do it. Therefore, I think it is acceptable to do so.

>
>
>> As global_private_temp-8.patch, think about:
>> 1 session X tale several hours doing some statistical work with the GTT A, which generated some data using transaction 100, The work is not over.
>> 2 Then session Y vacuumed A, and the GTT's relfrozenxid (in pg_class) was updated to 1000 0000.
>> 3 Then the aotuvacuum happened, the clog before 1000 0000 was cleaned up.
>> 4 The data in session A could be lost due to missing clog, The analysis task failed.
>>
>> However This is likely to happen because you allowed the GTT do vacuum.
>> And this is not a common problem, that not happen with local temp tables.
>> I feel uneasy about leaving such a question. We can improve it.
>>
>
> May be the easies solution is to prohibit explicit vacuum of GTT?
I think vacuum is an important part of GTT.

Looking back at previous emails, robert once said that vacuum GTT is pretty important.
https://www.postgresql.org/message-id/CA%2BTgmob%3DL1k0cpXRcipdsaE07ok%2BOn%3DtTjRiw7FtD_D2T%3DJwhg%40mail.gmail.com <https://www.postgresql.org/message-id/CA+Tgmob=L1k0cpXRcipdsaE07ok+On=tTjRiw7FtD_D2T=Jwhg@mail.gmail.com>

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com <http://www.postgrespro.com/>
> The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2020-02-04 16:10:15 Re: Memory-Bounded Hash Aggregation
Previous Message Julien Rouhaud 2020-02-04 14:27:25 Re: Expose lock group leader pid in pg_stat_activity