Re: [Proposal] Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2020-01-24 08:20:09
Message-ID: db3f9b60-6d8b-5eb2-aa1b-8aaf33b62a14@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 23.01.2020 19:28, 曾文旌(义从) wrote:
>
> I'm trying to improve this part of the implementation in
> global_temporary_table_v7-pg13.patch
> Please check my patch and give me feedback.
>
>
> Thanks
>
> Wenjing
>
>

Below is my short review of the patch:

+    /*
+     * For global temp table only
+     * use AccessExclusiveLock for ensure safety
+     */
+    {
+        {
+            "on_commit_delete_rows",
+            "global temp table on commit options",
+            RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+            ShareUpdateExclusiveLock
+        },
+        true
+    },

The comment seems to be confusing: it says about AccessExclusiveLock but
actually uses ShareUpdateExclusiveLock.

- Assert(TransactionIdIsNormal(onerel->rd_rel->relfrozenxid));
-    Assert(MultiXactIdIsValid(onerel->rd_rel->relminmxid));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) &&
onerel->rd_rel->relfrozenxid == InvalidTransactionId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) &&
TransactionIdIsNormal(onerel->rd_rel->relfrozenxid)));
+    Assert((RELATION_IS_GLOBAL_TEMP(onerel) &&
onerel->rd_rel->relminmxid == InvalidMultiXactId) ||
+        (!RELATION_IS_GLOBAL_TEMP(onerel) &&
MultiXactIdIsValid(onerel->rd_rel->relminmxid)));

It is actually equivalent to:

Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^
TransactionIdIsNormal(onerel->rd_rel->relfrozenxid);
Assert(RELATION_IS_GLOBAL_TEMP(onerel) ^
MultiXactIdIsValid(onerel->rd_rel->relminmxid));

+    /* clean temp relation files */
+    if (max_active_gtt > 0)
+        RemovePgTempFiles();
+
     /*

I wonder why do we need some special check for GTT here.
From my point of view cleanup at startup of local storage of temp
tables should be performed in the same way for local and global temp tables.

-    new_rel_reltup->relfrozenxid = relfrozenxid;
-    new_rel_reltup->relminmxid = relminmxid;
+    /* global temp table not remember transaction info in catalog */
+    if (relpersistence == RELPERSISTENCE_GLOBAL_TEMP)
+    {
+        new_rel_reltup->relfrozenxid = InvalidTransactionId;
+        new_rel_reltup->relminmxid = InvalidMultiXactId;
+    }
+    else
+    {
+        new_rel_reltup->relfrozenxid = relfrozenxid;
+        new_rel_reltup->relminmxid = relminmxid;
+    }
+

Why do we need to do it for GTT?
Did you check that there will be no problems with GTT in case of XID
wraparound?
Right now if you create temp table and keep session open, then it will
block XID wraparound.

+    /* We allow to drop global temp table only this session use it */
+    if (RELATION_IS_GLOBAL_TEMP(rel))
+    {
+        if (is_other_backend_use_gtt(rel->rd_node))
+            elog(ERROR, "can not drop relation when other backend
attached this global temp table");
+    }
+

Here we once again introduce incompatibility with normal (permanent) tables.
Assume that DBA or programmer need to change format of GTT. But there
are some active sessions which have used this GTT sometime in the past.
We will not be able to drop this GTT until all this sessions are terminated.
I do not think that it is acceptable behaviour.

+        LOCKMODE    lockmode = AccessExclusiveLock;
+
+        /* truncate global temp table only need RowExclusiveLock */
+        if (get_rel_persistence(rid) == RELPERSISTENCE_GLOBAL_TEMP)
+            lockmode = RowExclusiveLock;

What are the reasons of using RowExclusiveLock for GTT instead of
AccessExclusiveLock?
Yes, GTT data is access only by one backend so no locking here seems to
be needed at all.
But I wonder what are the motivations/benefits of using weaker lock
level here?
There should be no conflicts in any case...

+        /* We allow to create index on global temp table only this
session use it */
+        if (is_other_backend_use_gtt(heapRelation->rd_node))
+            elog(ERROR, "can not create index when have other backend
attached this global temp table");
+

The same argument as in case of dropping GTT: I do not think that
prohibiting DLL operations on GTT used by more than one backend is bad idea.

+    /* global temp table not support foreign key constraint yet */
+    if (RELATION_IS_GLOBAL_TEMP(pkrel))
+        ereport(ERROR,
+                (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                 errmsg("referenced relation \"%s\" is not a global
temp table",
+                        RelationGetRelationName(pkrel))));
+

Why do we need to prohibit foreign key constraint on GTT?

+    /*
+     * Global temp table get frozenxid from MyProc
+     * to avoid the vacuum truncate clog that gtt need.
+     */
+    if (max_active_gtt > 0)
+    {
+        TransactionId oldest_gtt_frozenxid =
+            list_all_session_gtt_frozenxids(0, NULL, NULL, NULL);
+
+        if (TransactionIdIsNormal(oldest_gtt_frozenxid) &&
+            TransactionIdPrecedes(oldest_gtt_frozenxid, newFrozenXid))
+        {
+            ereport(WARNING,
+                (errmsg("global temp table oldest FrozenXid is far in
the past"),
+                 errhint("please truncate them or kill those sessions
that use them.")));
+            newFrozenXid = oldest_gtt_frozenxid;
+        }
+    }
+

As far as I understand, content of GTT will never be processes by
autovacuum.
So who will update frozenxid of GTT?
I see that up_gtt_relstats is invoked when:
- index is created on GTT
- GTT is truncated
- GTT is vacuumed
So unless GTT is explicitly vacuumed by user, its GTT is and them will
not be taken in account
when computing new frozen xid value. Autovacumm will produce this
warnings (which will ton be visible by end user and only appended to the
log).
And at some moment of time wrap around happen and if there still some
old active GTT, we will get incorrect results.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-01-24 08:27:10 Re: polymorphic table functions light
Previous Message Peter Eisentraut 2020-01-24 08:11:04 Re: polymorphic table functions light