Re: [Proposal] Global temporary tables

From: 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "蔡松露(子嘉)" <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-04-26 08:09:02
Message-ID: 318BAF11-FE5A-4CE0-9286-3695F9AD0A8A@alibaba-inc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2020年4月23日 下午3:43,Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> 写道:
>
>
>
> čt 23. 4. 2020 v 9:10 odesílatel 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com <mailto:wenjing(dot)zwj(at)alibaba-inc(dot)com>> napsal:
>
>
>> 2020年4月22日 下午10:50,Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com <mailto:pavel(dot)stehule(at)gmail(dot)com>> 写道:
>>
>>
>>
>> st 22. 4. 2020 v 16:38 odesílatel Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com <mailto:prabhat(dot)sahu(at)enterprisedb(dot)com>> napsal:
>>
>>
>> On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com <mailto:wenjing(dot)zwj(at)alibaba-inc(dot)com>> wrote:
>>>
>>> Although the implementation of GTT is different, I think so TRUNCATE on Postgres (when it is really finalized) can remove session metadata of GTT too (and reduce usage's counter). It is not critical feature, but I think so it should not be hard to implement. From practical reason can be nice to have a tool how to refresh GTT without a necessity to close session. TRUNCATE can be this tool.
> Sorry, I don't quite understand what you mean, could you describe it in detail?
> In my opinion the TRUNCATE GTT cannot clean up data in other sessions, especially clean up local buffers in other sessions.
>
> It is about a possibility to force reset GTT to default empty state for all sessions.
>
> Maybe it is some what does your TRUNCATE DROP, but I don't think so this design (TRUNCATE DROP) is good, because then user have to know implementation detail.
>
> I prefer some like TRUNCATE tab WITH OPTION (GLOBAL, FORCE) - "GLOBAL" .. apply on all sessions, FORCE try to do without waiting on some global lock, try to do immediately with possibility to cancel some statements and rollback some session.
>
> instead GLOBAL maybe we can use "ALLSESSION", or "ALL SESSION" or some else
>
> But I like possible terminology LOCAL x GLOBAL for GTT. What I mean? Some statements like "TRUNCATE" can works (by default) in "local" mode .. it has impact to current session only. But sometimes can be executed in "global" mode with effect on all sessions.
The TRUNCATE GTT GLOBAL like DROP GTT FORCE you mentioned that before.
I think this requires identifying sessions that have initialized the stored file and no actual data.
And Handling local buffers on other session and locks is also difficult.
It may be harder than dropping the GTT force, which can kill other sessions, but TRUNCATE GTT would prefer not to.
This doesn't seem to complete the basic conditions, it's not easy.
So, I want to put this feature in next releases, along with DROP GTT FORCE.
Also, in view of your comments, I roll back the feature of TRUNCATE GTT DROP.

Wenjing

>
>
>
>> Yes, I think we need a way to delete the GTT local storage without closing the session.
>>
>> I provide the TRUNCATE tablename DROP to clear the data in the GTT and delete the storage files.
>> This feature requires the current transaction to commit immediately after it finishes truncate.
>>
>> Hi Wenjing,
>> Thanks for the patch(v30) for the new syntax support for (TRUNCATE table_name DROP) for deleting storage files after TRUNCATE on GTT.
>>
>> This syntax looks strange, and I don't think so it solve anything in practical life, because without lock the table will be used in few seconds by other sessions.
>
> If a dba wants to delete or modify a GTT, he can use locks to help him make the change.
>
> postgres=# begin;
> BEGIN
> postgres=*# LOCK TABLE gtt2 IN ACCESS EXCLUSIVE MODE;
> postgres=*# select * from pg_gtt_attached_pids ;
>
> Kill session or let session do TRUNCATE tablename DROP
>
> postgres=*# drop table gtt2;
> DROP TABLE
> postgres=*# commit;
> COMMIT
>
> yes, user can lock a tables. But I think so it is user friendly design. I don't remember any statement in Postgres, where I have to use table locks explicitly.
>
> For builtin commands it should be done transparently (for user).
It can be improved ,like DROP GTT FORCE.

>
> Regards
>
> Pavel
>
>
>>
>> This is same topic when we talked about ALTER - when and where the changes should be applied.
>>
>> The CLUSTER commands works only on session private data, so it should not to need some special lock or some special cleaning before.
>>
>> Regards
>>
>> Pavel
>>
>>
>> Please check below scenarios:
>>
>> Case1:
>> -- session1:
>> postgres=# create global temporary table gtt2 (c1 integer) on commit preserve rows;
>> CREATE TABLE
>> postgres=# create index idx1 on gtt2 (c1);
>> CREATE INDEX
>> postgres=# create index idx2 on gtt2 (c1) where c1%2 =0;
>> CREATE INDEX
>> postgres=#
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR: cannot cluster on partial index "idx2"
>>
>> Case2:
>> -- Session2:
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> CLUSTER
>>
>> postgres=# insert into gtt2 values(1);
>> INSERT 0 1
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR: cannot cluster on partial index "idx2"
>>
>> Case3:
>> -- Session2:
>> postgres=# TRUNCATE gtt2 DROP;
>> TRUNCATE TABLE
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> CLUSTER
>>
>> In Case2, Case3 we can observe, with the absence of data in GTT, we are able to "CLUSTER gtt2 USING idx2;" (having partial index)
>> But why does the same query fail for Case1 (absence of data)?
>>
>> Thanks,
>> Prabhat Sahu
>>
>>
>>
>>
>> Wenjing
>>
>>
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>> All in all, I think the current implementation is sufficient for dba to manage GTT.
>>>
>>>> 2020年4月2日 下午4:45,Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com <mailto:prabhat(dot)sahu(at)enterprisedb(dot)com>> 写道:
>>>>
>>>> Hi All,
>>>>
>>>> I have noted down few behavioral difference in our GTT implementation in PG as compared to Oracle DB:
>>>> As per my understanding, the behavior of DROP TABLE in case of "Normal table and GTT" in Oracle DB are as below:
>>>> Any tables(Normal table / GTT) without having data in a session, we will be able to DROP from another session.
>>>> For a completed transaction on a normal table having data, we will be able to DROP from another session. If the transaction is not yet complete, and we are trying to drop the table from another session, then we will get an error. (working as expected)
>>>> For a completed transaction on GTT with(on commit delete rows) (i.e. no data in GTT) in a session, we will be able to DROP from another session.
>>>> For a completed transaction on GTT with(on commit preserve rows) with data in a session, we will not be able to DROP from any session(not even from the session in which GTT is created), we need to truncate the table data first from all the session(session1, session2) which is having data.
>>>> 1. Any tables(Normal table / GTT) without having data in a session, we will be able to DROP from another session.
>>>> Session1:
>>>> create table t1 (c1 integer);
>>>> create global temporary table gtt1 (c1 integer) on commit delete rows;
>>>> create global temporary table gtt2 (c1 integer) on commit preserve rows;
>>>>
>>>> Session2:
>>>> drop table t1;
>>>> drop table gtt1;
>>>> drop table gtt2;
>>>>
>>>> -- Issue 1: But we are able to drop a simple table and failed to drop GTT as below.
>>>> postgres=# drop table t1;
>>>> DROP TABLE
>>>> postgres=# drop table gtt1;
>>>> ERROR: can not drop relation gtt1 when other backend attached this global temp table
>>>> postgres=# drop table gtt2;
>>>> ERROR: can not drop relation gtt2 when other backend attached this global temp table
>>>>
>>>> 3. For a completed transaction on GTT with(on commit delete rows) (i.e. no data in GTT) in a session, we will be able to DROP from another session.
>>>> Session1:
>>>> create global temporary table gtt1 (c1 integer) on commit delete rows;
>>>>
>>>> Session2:
>>>> drop table gtt1;
>>>>
>>>> -- Issue 2: But we are getting error for GTT with(on_commit_delete_rows) without data.
>>>> postgres=# drop table gtt1;
>>>> ERROR: can not drop relation gtt1 when other backend attached this global temp table
>>>>
>>>> 4. For a completed transaction on GTT with(on commit preserve rows) with data in any session, we will not be able to DROP from any session(not even from the session in which GTT is created)
>>>>
>>>> Case1:
>>>> create global temporary table gtt2 (c1 integer) on commit preserve rows;
>>>> insert into gtt2 values(100);
>>>> drop table gtt2;
>>>>
>>>> SQL> drop table gtt2;
>>>> drop table gtt2
>>>> *
>>>> ERROR at line 1:
>>>> ORA-14452: attempt to create, alter or drop an index on temporary table already in use
>>>>
>>>> -- Issue 3: But, we are able to drop the GTT(having data) which we have created in the same session.
>>>> postgres=# drop table gtt2;
>>>> DROP TABLE
>>>>
>>>> Case2: GTT with(on commit preserve rows) having data in both session1 and session2
>>>> Session1:
>>>> create global temporary table gtt2 (c1 integer) on commit preserve rows;
>>>> insert into gtt2 values(100);
>>>>
>>>> Session2:
>>>> insert into gtt2 values(200);
>>>>
>>>> -- If we try to drop the table from any session we should get an error, it is working fine.
>>>> drop table gtt2;
>>>> SQL> drop table gtt2;
>>>> drop table gtt2
>>>> *
>>>> ERROR at line 1:
>>>> ORA-14452: attempt to create, alter or drop an index on temporary table already in use
>>>>
>>>> postgres=# drop table gtt2 ;
>>>> ERROR: can not drop relation gtt2 when other backend attached this global temp table
>>>>
>>>> -- To drop the table gtt2 from any session1/session2, we need to truncate the table data first from all the session(session1, session2) which is having data.
>>>> Session1:
>>>> truncate table gtt2;
>>>> -- Session2:
>>>> truncate table gtt2;
>>>>
>>>> Session 2:
>>>> SQL> drop table gtt2;
>>>>
>>>> Table dropped.
>>>>
>>>> -- Issue 4: But we are not able to drop the GTT, even after TRUNCATE the table in all the sessions.
>>>> -- truncate from all sessions where GTT have data.
>>>> postgres=# truncate gtt2 ;
>>>> TRUNCATE TABLE
>>>>
>>>> -- try to DROP GTT still, we are getting error.
>>>> postgres=# drop table gtt2 ;
>>>> ERROR: can not drop relation gtt2 when other backend attached this global temp table
>>>>
>>>> To drop the GTT from any session, we need to exit from all other sessions.
>>>> postgres=# drop table gtt2 ;
>>>> DROP TABLE
>>>>
>>>> Kindly let me know if I am missing something.
>>>>
>>>>
>>>> On Wed, Apr 1, 2020 at 6:26 PM Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com <mailto:prabhat(dot)sahu(at)enterprisedb(dot)com>> wrote:
>>>> Hi Wenjing,
>>>> I hope we need to change the below error message.
>>>>
>>>> postgres=# create global temporary table gtt(c1 int) on commit preserve rows;
>>>> CREATE TABLE
>>>>
>>>> postgres=# create materialized view mvw as select * from gtt;
>>>> ERROR: materialized views must not use global temporary tables or views
>>>>
>>>> Anyways we are not allowed to create a "global temporary view",
>>>> so the above ERROR message should change(i.e. " or view" need to be removed from the error message) something like:
>>>> "ERROR: materialized views must not use global temporary tables"
>>>>
>>>> --
>>>> With Regards,
>>>> Prabhat Kumar Sahu
>>>> EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>
>>>>
>>>>
>>>> --
>>>> With Regards,
>>>> Prabhat Kumar Sahu
>>>> EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>
>>>
>>
>>
>>
>> --
>> With Regards,
>> Prabhat Kumar Sahu
>> EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-04-26 08:12:04 Re: WIP: Aggregation push-down
Previous Message Andy Fan 2020-04-26 07:07:00 Re: Subplan result caching