From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | wenjing(at)gmail(dot)com |
Cc: | Andrew Bille <andrewbille(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tony Zhu <Tony(dot)zhu(at)ww-it(dot)cn>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [Proposal] Global temporary tables |
Date: | 2022-02-25 07:45:00 |
Message-ID: | 20220225074500.sfizxbmlrj2s6hp5@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
This is a huge thread. Realistically reviewers and committers can't reread
it. I think there needs to be more of a description of how this works included
in the patchset and *why* it works that way. The readme does a bit of that,
but not particularly well.
On 2022-02-25 14:26:47 +0800, Wenjing Zeng wrote:
> +++ b/README.gtt.txt
> @@ -0,0 +1,172 @@
> +Global Temporary Table(GTT)
> +=========================================
> +
> +Feature description
> +-----------------------------------------
> +
> +Previously, temporary tables are defined once and automatically
> +exist (starting with empty contents) in every session before using them.
I think for a README "previously" etc isn't good language - if it were
commited, it'd not be understandable anymore. It makes more sense for commit
messages etc.
> +Main design ideas
> +-----------------------------------------
> +In general, GTT and LTT use the same storage and buffer design and
> +implementation. The storage files for both types of temporary tables are named
> +as t_backendid_relfilenode, and the local buffer is used to cache the data.
What does "named ast_backendid_relfilenode" mean?
> +The schema of GTTs is shared among sessions while their data are not. We build
> +a new mechanisms to manage those non-shared data and their statistics.
> +Here is the summary of changes:
> +
> +1) CATALOG
> +GTTs store session-specific data. The storage information of GTTs'data, their
> +transaction information, and their statistics are not stored in the catalog.
> +
> +2) STORAGE INFO & STATISTICS INFO & TRANSACTION INFO
> +In order to maintain durability and availability of GTTs'session-specific data,
> +their storage information, statistics, and transaction information is managed
> +in a local hash table tt_storage_local_hash.
"maintain durability"? Durable across what? In the context of databases it's
typically about crash safety, but that can't be the case here.
> +3) DDL
> +Currently, GTT supports almost all table'DDL except CLUSTER/VACUUM FULL.
> +Part of the DDL behavior is limited by shared definitions and multiple copies of
> +local data, and we added some structures to handle this.
> +A shared hash table active_gtt_shared_hash is added to track the state of the
> +GTT in a different session. This information is recorded in the hash table
> +during the DDL execution of the GTT.
> +The data stored in a GTT can only be modified or accessed by owning session.
> +The statements that only modify data in a GTT do not need a high level of
> +table locking. The operations making those changes include truncate GTT,
> +reindex GTT, and lock GTT.
I think you need to introduce a bit more terminology for any of this to make
sense. Sometimes GTT means the global catalog entity, sometimes, like here, it
appears to mean the session specific contents of a GTT.
What state of a GTT in a nother session?
How do GTTs handle something like BEGIN; TRUNCATE some_gtt_table; ROLLBACK;?
> +1.2 on commit clause
> +LTT's status associated with on commit DELETE ROWS and on commit PRESERVE ROWS
> +is not stored in catalog. Instead, GTTs need a bool value on_commit_delete_rows
> +in reloptions which is shared among sessions.
Why?
> +2.3 statistics info
> +1) relpages reltuples relallvisible relfilenode
?
> +3 DDL
> +3.1. active_gtt_shared_hash
> +This is the hash table created in shared memory to trace the GTT files initialized
> +in each session. Each hash entry contains a bitmap that records the backendid of
> +the initialized GTT file. With this hash table, we know which backend/session
> +is using this GTT. Such information is used during GTT's DDL operations.
So there's a separate locking protocol for GTTs that doesn't use the normal
locking infrastructure? Why?
> +3.7 CLUSTER GTT/VACUUM FULL GTT
> +The current version does not support.
Why?
> +4 MVCC commit log(clog) cleanup
> +
> +The GTT storage file contains transaction information. Queries for GTT data rely
> +on transaction information such as clog. The transaction information required by
> +each session may be completely different.
Why is transaction information different between sessions? Or does this just
mean that different transaction ids will be accessed?
0003-gtt-v67-implementation.patch
71 files changed, 3167 insertions(+), 195 deletions(-)
This needs to be broken into smaller chunks to be reviewable.
> @@ -677,6 +678,14 @@ _bt_getrootheight(Relation rel)
> {
> Buffer metabuf;
>
> + /*
> + * If a global temporary table storage file is not initialized in the
> + * this session, its index does not have a root page, just returns 0.
> + */
> + if (RELATION_IS_GLOBAL_TEMP(rel) &&
> + !gtt_storage_attached(RelationGetRelid(rel)))
> + return 0;
> +
> metabuf = _bt_getbuf(rel, BTREE_METAPAGE, BT_READ);
> metad = _bt_getmeta(rel, metabuf);
Stuff like this seems not acceptable. Accesses would have to be prevented much
earlier. Otherwise each index method is going to need copies of this logic. I
also doubt that _bt_getrootheight() is the only place that'd need this.
> static void
> index_update_stats(Relation rel,
> bool hasindex,
> - double reltuples)
> + double reltuples,
> + bool isreindex)
> {
> Oid relid = RelationGetRelid(rel);
> Relation pg_class;
> @@ -2797,6 +2824,13 @@ index_update_stats(Relation rel,
> Form_pg_class rd_rel;
> bool dirty;
>
> + /*
> + * Most of the global Temp table data is updated to the local hash, and reindex
> + * does not refresh relcache, so call a separate function.
> + */
> + if (RELATION_IS_GLOBAL_TEMP(rel))
> + return index_update_gtt_relstats(rel, hasindex, reltuples, isreindex);
> +
So basically every single place in the code that does catalog accesses is
going to need a completely separate implementation for GTTs? That seems
unmaintainable.
> +/*-------------------------------------------------------------------------
> + *
> + * storage_gtt.c
> + * The body implementation of Global temparary table.
> + *
> + * IDENTIFICATION
> + * src/backend/catalog/storage_gtt.c
> + *
> + * See src/backend/catalog/GTT_README for Global temparary table's
> + * requirements and design.
> + *
> + *-------------------------------------------------------------------------
> + */
I don't think that path to the readme is correct.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2022-02-25 07:47:01 | Re: Add checkpoint and redo LSN to LogCheckpointEnd log message |
Previous Message | Andres Freund | 2022-02-25 07:14:22 | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |