Re: [Proposal] Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com>
Cc: Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-04-23 07:43:53
Message-ID: CAFj8pRDVURvQBZhGeUEBBTaqnjfK3w7u+xZfAqVB+-ncUezFZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

čt 23. 4. 2020 v 9:10 odesílatel 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com> napsal:

>
>
> 2020年4月22日 下午10:50,Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> 写道:
>
>
>
> st 22. 4. 2020 v 16:38 odesílatel Prabhat Sahu <
> prabhat(dot)sahu(at)enterprisedb(dot)com> napsal:
>
>>
>>
>> On Wed, Apr 22, 2020 at 2:49 PM 曾文旌 <wenjing(dot)zwj(at)alibaba-inc(dot)com> wrote:
>>
>>>
>>> Although the implementation of GTT is different, I think so TRUNCATE on
>>> Postgres (when it is really finalized) can remove session metadata of GTT
>>> too (and reduce usage's counter). It is not critical feature, but I think
>>> so it should not be hard to implement. From practical reason can be nice to
>>> have a tool how to refresh GTT without a necessity to close session.
>>> TRUNCATE can be this tool.
>>>
>>> Sorry, I don't quite understand what you mean, could you describe it in
> detail?
> In my opinion the TRUNCATE GTT cannot clean up data in other sessions,
> especially clean up local buffers in other sessions.
>

It is about a possibility to force reset GTT to default empty state for all
sessions.

Maybe it is some what does your TRUNCATE DROP, but I don't think so this
design (TRUNCATE DROP) is good, because then user have to know
implementation detail.

I prefer some like TRUNCATE tab WITH OPTION (GLOBAL, FORCE) - "GLOBAL" ..
apply on all sessions, FORCE try to do without waiting on some global lock,
try to do immediately with possibility to cancel some statements and
rollback some session.

instead GLOBAL maybe we can use "ALLSESSION", or "ALL SESSION" or some else

But I like possible terminology LOCAL x GLOBAL for GTT. What I mean? Some
statements like "TRUNCATE" can works (by default) in "local" mode .. it
has impact to current session only. But sometimes can be executed in
"global" mode with effect on all sessions.

>
> Yes, I think we need a way to delete the GTT local storage without closing
>>> the session.
>>>
>>> I provide the TRUNCATE tablename DROP to clear the data in the GTT and
>>> delete the storage files.
>>> This feature requires the current transaction to commit immediately
>>> after it finishes truncate.
>>>
>>
>> Hi Wenjing,
>> Thanks for the patch(v30) for the new syntax support for (TRUNCATE
>> table_name DROP) for deleting storage files after TRUNCATE on GTT.
>>
>
> This syntax looks strange, and I don't think so it solve anything in
> practical life, because without lock the table will be used in few seconds
> by other sessions.
>
>
> If a dba wants to delete or modify a GTT, he can use locks to help him
> make the change.
>

> postgres=# begin;
> BEGIN
> postgres=*# LOCK TABLE gtt2 IN ACCESS EXCLUSIVE MODE;
> postgres=*# select * from pg_gtt_attached_pids ;
>
> Kill session or let session do TRUNCATE tablename DROP
>
> postgres=*# drop table gtt2;
> DROP TABLE
> postgres=*# commit;
> COMMIT
>

yes, user can lock a tables. But I think so it is user friendly design. I
don't remember any statement in Postgres, where I have to use table locks
explicitly.

For builtin commands it should be done transparently (for user).

Regards

Pavel

>
>
> This is same topic when we talked about ALTER - when and where the changes
> should be applied.
>
> The CLUSTER commands works only on session private data, so it should not
> to need some special lock or some special cleaning before.
>
> Regards
>
> Pavel
>
>
>>
>> Please check below scenarios:
>>
>>
>> *Case1:*-- session1:
>> postgres=# create global temporary table gtt2 (c1 integer) on commit
>> preserve rows;
>> CREATE TABLE
>> postgres=# create index idx1 on gtt2 (c1);
>> CREATE INDEX
>> postgres=# create index idx2 on gtt2 (c1) where c1%2 =0;
>> CREATE INDEX
>> postgres=#
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR: cannot cluster on partial index "idx2"
>>
>>
>> *Case2:*-- Session2:
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> CLUSTER
>>
>> postgres=# insert into gtt2 values(1);
>> INSERT 0 1
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> ERROR: cannot cluster on partial index "idx2"
>>
>>
>> *Case3:*-- Session2:
>> postgres=# TRUNCATE gtt2 DROP;
>> TRUNCATE TABLE
>> postgres=# CLUSTER gtt2 USING idx1;
>> CLUSTER
>> postgres=# CLUSTER gtt2 USING idx2;
>> CLUSTER
>>
>> In Case2, Case3 we can observe, with the absence of data in GTT, we are
>> able to "CLUSTER gtt2 USING idx2;" (having partial index)
>> But why does the same query fail for Case1 (absence of data)?
>>
>> Thanks,
>> Prabhat Sahu
>>
>>
>>
>>>
>>>
>>> Wenjing
>>>
>>>
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>>> All in all, I think the current implementation is sufficient for dba to
>>>> manage GTT.
>>>>
>>>> 2020年4月2日 下午4:45,Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com> 写道:
>>>>
>>>> Hi All,
>>>>
>>>> I have noted down few behavioral difference in our GTT implementation
>>>> in PG as compared to Oracle DB:
>>>> As per my understanding, the behavior of DROP TABLE in case of "Normal
>>>> table and GTT" in Oracle DB are as below:
>>>>
>>>> 1. Any tables(Normal table / GTT) without having data in a session,
>>>> we will be able to DROP from another session.
>>>> 2. For a completed transaction on a normal table having data, we
>>>> will be able to DROP from another session. If the transaction is not yet
>>>> complete, and we are trying to drop the table from another session, then we
>>>> will get an error. (working as expected)
>>>> 3. For a completed transaction on GTT with(on commit delete rows)
>>>> (i.e. no data in GTT) in a session, we will be able to DROP from another
>>>> session.
>>>> 4. For a completed transaction on GTT with(on commit preserve rows)
>>>> with data in a session, we will not be able to DROP from any session(not
>>>> even from the session in which GTT is created), we need to truncate the
>>>> table data first from all the session(session1, session2) which is having
>>>> data.
>>>>
>>>> *1. Any tables(Normal table / GTT) without having data in a session, we
>>>> will be able to DROP from another session.*
>>>> *Session1:*
>>>> create table t1 (c1 integer);
>>>> create global temporary table gtt1 (c1 integer) on commit delete rows;
>>>> create global temporary table gtt2 (c1 integer) on commit preserve rows;
>>>>
>>>> *Session2:*
>>>> drop table t1;
>>>> drop table gtt1;
>>>> drop table gtt2;
>>>>
>>>> -- *Issue 1:* But we are able to drop a simple table and failed to
>>>> drop GTT as below.
>>>>
>>>> postgres=# drop table t1;
>>>> DROP TABLE
>>>> postgres=# drop table gtt1;
>>>> ERROR: can not drop relation gtt1 when other backend attached this
>>>> global temp table
>>>> postgres=# drop table gtt2;
>>>> ERROR: can not drop relation gtt2 when other backend attached this
>>>> global temp table
>>>>
>>>>
>>>> *3. For a completed transaction on GTT with(on commit delete rows)
>>>> (i.e. no data in GTT) in a session, we will be able to DROP from another
>>>> session.*
>>>>
>>>> *Session1:*create global temporary table gtt1 (c1 integer) on commit
>>>> delete rows;
>>>>
>>>> *Session2:*
>>>> drop table gtt1;
>>>>
>>>> -- *Issue 2:* But we are getting error for GTT
>>>> with(on_commit_delete_rows) without data.
>>>>
>>>> postgres=# drop table gtt1;
>>>> ERROR: can not drop relation gtt1 when other backend attached this
>>>> global temp table
>>>>
>>>>
>>>> *4. For a completed transaction on GTT with(on commit preserve
>>>> rows) with data in any session, we will not be able to DROP from any
>>>> session(not even from the session in which GTT is created)*
>>>>
>>>> *Case1:*
>>>> create global temporary table gtt2 (c1 integer) on commit preserve rows;
>>>> insert into gtt2 values(100);
>>>> drop table gtt2;
>>>>
>>>> SQL> drop table gtt2;
>>>> drop table gtt2
>>>> *
>>>> ERROR at line 1:
>>>> ORA-14452: attempt to create, alter or drop an index on temporary table
>>>> already in use
>>>>
>>>> -- *Issue 3:* But, we are able to drop the GTT(having data) which we
>>>> have created in the same session.
>>>>
>>>> postgres=# drop table gtt2;
>>>> DROP TABLE
>>>>
>>>>
>>>>
>>>>
>>>> *Case2: GTT with(on commit preserve rows) having data in both session1
>>>> and session2Session1:*create global temporary table gtt2 (c1 integer)
>>>> on commit preserve rows;
>>>> insert into gtt2 values(100);
>>>>
>>>>
>>>> *Session2:*insert into gtt2 values(200);
>>>>
>>>> -- If we try to drop the table from any session we should get an error,
>>>> it is working fine.
>>>> drop table gtt2;
>>>>
>>>> SQL> drop table gtt2;
>>>> drop table gtt2
>>>> *
>>>> ERROR at line 1:
>>>> ORA-14452: attempt to create, alter or drop an index on temporary table
>>>> already in use
>>>>
>>>> postgres=# drop table gtt2 ;
>>>> ERROR: can not drop relation gtt2 when other backend attached this
>>>> global temp table
>>>>
>>>>
>>>> -- To drop the table gtt2 from any session1/session2, we need to
>>>> truncate the table data first from all the session(session1, session2)
>>>> which is having data.
>>>> *Session1:*
>>>> truncate table gtt2;
>>>> -- Session2:
>>>> truncate table gtt2;
>>>>
>>>> *Session 2:*
>>>> SQL> drop table gtt2;
>>>>
>>>> Table dropped.
>>>>
>>>> -- *Issue 4:* But we are not able to drop the GTT, even after TRUNCATE
>>>> the table in all the sessions.
>>>> -- truncate from all sessions where GTT have data.
>>>> postgres=# truncate gtt2 ;
>>>> TRUNCATE TABLE
>>>>
>>>> -- *try to DROP GTT still, we are getting error.*
>>>>
>>>> postgres=# drop table gtt2 ;
>>>> ERROR: can not drop relation gtt2 when other backend attached this
>>>> global temp table
>>>>
>>>>
>>>> To drop the GTT from any session, we need to exit from all other
>>>> sessions.
>>>> postgres=# drop table gtt2 ;
>>>> DROP TABLE
>>>>
>>>> Kindly let me know if I am missing something.
>>>>
>>>>
>>>> On Wed, Apr 1, 2020 at 6:26 PM Prabhat Sahu <
>>>> prabhat(dot)sahu(at)enterprisedb(dot)com> wrote:
>>>>
>>>>> Hi Wenjing,
>>>>> I hope we need to change the below error message.
>>>>>
>>>>> postgres=# create global temporary table gtt(c1 int) on commit
>>>>> preserve rows;
>>>>> CREATE TABLE
>>>>>
>>>>> postgres=# create materialized view mvw as select * from gtt;
>>>>> ERROR: materialized views must not use global temporary tables* or
>>>>> views*
>>>>>
>>>>> Anyways we are not allowed to create a "global temporary view",
>>>>> so the above ERROR message should change(i.e. *" or view"* need to be
>>>>> removed from the error message) something like:
>>>>> *"ERROR: materialized views must not use global temporary tables"*
>>>>>
>>>>> --
>>>>>
>>>>> With Regards,
>>>>> Prabhat Kumar Sahu
>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> With Regards,
>>>> Prabhat Kumar Sahu
>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>
>>>>
>>>>
>>>
>>
>> --
>>
>> With Regards,
>> Prabhat Kumar Sahu
>> EnterpriseDB: http://www.enterprisedb.com
>>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Asif Rehman 2020-04-23 08:17:03 Re: WIP/PoC for parallel backup
Previous Message Richard Guo 2020-04-23 07:28:21 Re: sqlsmith crash incremental sort