Re: [Proposal] Global temporary tables

From: Andres Freund <andres(at)anarazel(dot)de>
To: wenjing(at)gmail(dot)com
Cc: Andrew Bille <andrewbille(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tony Zhu <Tony(dot)zhu(at)ww-it(dot)cn>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Global temporary tables
Date: 2022-02-25 07:45:00
Message-ID: 20220225074500.sfizxbmlrj2s6hp5@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

This is a huge thread. Realistically reviewers and committers can't reread
it. I think there needs to be more of a description of how this works included
in the patchset and *why* it works that way. The readme does a bit of that,
but not particularly well.

On 2022-02-25 14:26:47 +0800, Wenjing Zeng wrote:
> +++ b/README.gtt.txt
> @@ -0,0 +1,172 @@
> +Global Temporary Table(GTT)
> +=========================================
> +
> +Feature description
> +-----------------------------------------
> +
> +Previously, temporary tables are defined once and automatically
> +exist (starting with empty contents) in every session before using them.

I think for a README "previously" etc isn't good language - if it were
commited, it'd not be understandable anymore. It makes more sense for commit
messages etc.

> +Main design ideas
> +-----------------------------------------
> +In general, GTT and LTT use the same storage and buffer design and
> +implementation. The storage files for both types of temporary tables are named
> +as t_backendid_relfilenode, and the local buffer is used to cache the data.

What does "named ast_backendid_relfilenode" mean?

> +The schema of GTTs is shared among sessions while their data are not. We build
> +a new mechanisms to manage those non-shared data and their statistics.
> +Here is the summary of changes:
> +
> +1) CATALOG
> +GTTs store session-specific data. The storage information of GTTs'data, their
> +transaction information, and their statistics are not stored in the catalog.
> +
> +2) STORAGE INFO & STATISTICS INFO & TRANSACTION INFO
> +In order to maintain durability and availability of GTTs'session-specific data,
> +their storage information, statistics, and transaction information is managed
> +in a local hash table tt_storage_local_hash.

"maintain durability"? Durable across what? In the context of databases it's
typically about crash safety, but that can't be the case here.

> +3) DDL
> +Currently, GTT supports almost all table'DDL except CLUSTER/VACUUM FULL.
> +Part of the DDL behavior is limited by shared definitions and multiple copies of
> +local data, and we added some structures to handle this.

> +A shared hash table active_gtt_shared_hash is added to track the state of the
> +GTT in a different session. This information is recorded in the hash table
> +during the DDL execution of the GTT.

> +The data stored in a GTT can only be modified or accessed by owning session.
> +The statements that only modify data in a GTT do not need a high level of
> +table locking. The operations making those changes include truncate GTT,
> +reindex GTT, and lock GTT.

I think you need to introduce a bit more terminology for any of this to make
sense. Sometimes GTT means the global catalog entity, sometimes, like here, it
appears to mean the session specific contents of a GTT.

What state of a GTT in a nother session?

How do GTTs handle something like BEGIN; TRUNCATE some_gtt_table; ROLLBACK;?

> +1.2 on commit clause
> +LTT's status associated with on commit DELETE ROWS and on commit PRESERVE ROWS
> +is not stored in catalog. Instead, GTTs need a bool value on_commit_delete_rows
> +in reloptions which is shared among sessions.

Why?

> +2.3 statistics info
> +1) relpages reltuples relallvisible relfilenode

?

> +3 DDL
> +3.1. active_gtt_shared_hash
> +This is the hash table created in shared memory to trace the GTT files initialized
> +in each session. Each hash entry contains a bitmap that records the backendid of
> +the initialized GTT file. With this hash table, we know which backend/session
> +is using this GTT. Such information is used during GTT's DDL operations.

So there's a separate locking protocol for GTTs that doesn't use the normal
locking infrastructure? Why?

> +3.7 CLUSTER GTT/VACUUM FULL GTT
> +The current version does not support.

Why?

> +4 MVCC commit log(clog) cleanup
> +
> +The GTT storage file contains transaction information. Queries for GTT data rely
> +on transaction information such as clog. The transaction information required by
> +each session may be completely different.

Why is transaction information different between sessions? Or does this just
mean that different transaction ids will be accessed?

0003-gtt-v67-implementation.patch
71 files changed, 3167 insertions(+), 195 deletions(-)

This needs to be broken into smaller chunks to be reviewable.

> @@ -677,6 +678,14 @@ _bt_getrootheight(Relation rel)
> {
> Buffer metabuf;
>
> + /*
> + * If a global temporary table storage file is not initialized in the
> + * this session, its index does not have a root page, just returns 0.
> + */
> + if (RELATION_IS_GLOBAL_TEMP(rel) &&
> + !gtt_storage_attached(RelationGetRelid(rel)))
> + return 0;
> +
> metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
> metad = _bt_getmeta(rel, metabuf);

Stuff like this seems not acceptable. Accesses would have to be prevented much
earlier. Otherwise each index method is going to need copies of this logic. I
also doubt that _bt_getrootheight() is the only place that'd need this.

> static void
> index_update_stats(Relation rel,
> bool hasindex,
> - double reltuples)
> + double reltuples,
> + bool isreindex)
> {
> Oid relid = RelationGetRelid(rel);
> Relation pg_class;
> @@ -2797,6 +2824,13 @@ index_update_stats(Relation rel,
> Form_pg_class rd_rel;
> bool dirty;
>
> + /*
> + * Most of the global Temp table data is updated to the local hash, and reindex
> + * does not refresh relcache, so call a separate function.
> + */
> + if (RELATION_IS_GLOBAL_TEMP(rel))
> + return index_update_gtt_relstats(rel, hasindex, reltuples, isreindex);
> +

So basically every single place in the code that does catalog accesses is
going to need a completely separate implementation for GTTs? That seems
unmaintainable.

> +/*-------------------------------------------------------------------------
> + *
> + * storage_gtt.c
> + * The body implementation of Global temparary table.
> + *
> + * IDENTIFICATION
> + * src/backend/catalog/storage_gtt.c
> + *
> + * See src/backend/catalog/GTT_README for Global temparary table's
> + * requirements and design.
> + *
> + *-------------------------------------------------------------------------
> + */

I don't think that path to the readme is correct.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-02-25 07:47:01 Re: Add checkpoint and redo LSN to LogCheckpointEnd log message
Previous Message Andres Freund 2022-02-25 07:14:22 Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations