Re: [Proposal] Global temporary tables

From: wenjing zeng <wjzeng2012(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "蔡松露(子嘉)" <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-07-22 11:53:25
Message-ID: B2215933-0AFA-4944-A003-2D10C3DE5932@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2020年7月14日 下午10:28,Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> 写道:
>
>
>
> pá 10. 7. 2020 v 11:04 odesílatel wenjing zeng <wjzeng2012(at)gmail(dot)com <mailto:wjzeng2012(at)gmail(dot)com>> napsal:
> HI all
>
> I started using my personal email to respond to community issue.
>
>
>
>> 2020年7月7日 下午6:05,Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com <mailto:pavel(dot)stehule(at)gmail(dot)com>> 写道:
>>
>> Hi
>>
>> GTT Merge the latest PGMaster and resolves conflicts.
>>
>>
>>
>> I tested it and it looks fine. I think it is very usable in current form, but still there are some issues:
>>
>> postgres=# create global temp table foo(a int);
>> CREATE TABLE
>> postgres=# insert into foo values(10);
>> INSERT 0 1
>> postgres=# alter table foo add column x int;
>> ALTER TABLE
>> postgres=# analyze foo;
>> WARNING: reloid 16400 not support update attstat after add colunm
>> WARNING: reloid 16400 not support update attstat after add colunm
>> ANALYZE
> This is a limitation that we can completely eliminate.
>
>>
>> Please, can you summarize what is done, what limits are there, what can be implemented hard, what can be implemented easily?
> Sure.
>
> The current version of the GTT implementation supports all regular table operations.
> 1 what is done
> 1.1 insert/update/delete on GTT.
> 1.2 The GTT supports all types of indexes, and the query statement supports the use of GTT indexes to speed up the reading of data in the GTT.
> 1.3 GTT statistics keep a copy of THE GTT local statistics, which are provided to the optimizer to choose the best query plan.
> 1.4 analyze vacuum GTT.
> 1.5 truncate cluster GTT.
> 1.6 all DDL on GTT.
> 1.7 GTT table can use GTT sequence or Regular sequence.
> 1.8 Support for creating views on GTT.
> 1.9 Support for creating views on foreign key.
> 1.10 support global temp partition.
>
> I feel like I cover all the necessary GTT requirements.
>
> For cluster GTT,I think it's complicated.
> I'm not sure the current implementation is quite reasonable. Maybe you can help review it.
>
>
>>
>>
>>
>> I found one open question - how can be implemented table locks - because data is physically separated, then we don't need table locks as protection against race conditions.
> Yes, but GTT’s DML DDL still requires table locking.
> 1 The DML requires table locks (RowExclusiveLock) to ensure that
> definitions do not change during run time (the DDL may modify or delete them).
> This part of the implementation does not actually change the code,
> because the DML on GTT does not block each other between sessions.
>
> 2 For truncate/analyze/vacuum reinidex cluster GTT is now like DML,
> they only modify local data and do not modify the GTT definition.
> So I lowered the table lock level held by the GTT, only need RowExclusiveLock.
>
> 3 For DDLs that need to be modified the GTT table definition(Drop GTT Alter GTT),
> an exclusive level of table locking is required(AccessExclusiveLock),
> as is the case for regular table.
> This part of the implementation also does not actually change the code.
>
> Summary: What I have done is to adjust the GTT lock levels in different types of statements based on the above thinking.
> For example, truncate GTT, I'm reducing the GTT holding table lock level to RowExclusiveLock,
> So We can truncate data in the same GTT between different sessions at the same time.
>
> What do you think about table locks on GTT?
>
> I am thinking about explicit LOCK statements. Some applications use explicit locking from some reasons - typically as protection against race conditions.
>
> But on GTT race conditions are not possible. So my question is - does the exclusive lock on GTT protection other sessions do insert into their own instances of the same GTT?
In my opinion, with a GTT, always work on the private data of the session,
there is no need to do anything by holding the lock, so the lock statement should do nothing (The same is true for ORACLE GTT)

What do you think?

>
> What is a level where table locks are active? shared part of GTT or session instance part of GTT?
I don't quite understand what you mean, could you explain it a little bit?

Wenjing

>
>
>
>
>
> Wenjing
>
>
>>
>> Now, table locks are implemented on a global level. So exclusive lock on GTT in one session block insertion on the second session. Is it expected behaviour? It is safe, but maybe it is too strict.
>>
>> We should define what table lock is meaning on GTT.
>>
>> Regards
>>
>> Pavel
>>
>> Pavel
>>
>>
>>> With Regards,
>>> Prabhat Kumar Sahu
>>> EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>
>>
>>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-07-22 12:25:29 Re: expose parallel leader in CSV and log_line_prefix
Previous Message Dilip Kumar 2020-07-22 11:24:52 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions