Re: [Proposal] Global temporary tables

From: Wenjing Zeng <wjzeng2012(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: wenjing <wenjing(at)gmail(dot)com>, Andrew Bille <andrewbille(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tony Zhu <Tony(dot)zhu(at)ww-it(dot)cn>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [Proposal] Global temporary tables
Date: 2022-03-01 07:10:11
Message-ID: 17EEA6A4-3FA5-4C01-81BB-26D32BEC6343@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2022年2月27日 08:21,Justin Pryzby <pryzby(at)telsasoft(dot)com> 写道:
>
> I read through this.
> Find attached some language fixes. You should be able to apply each "fix"
> patch on top of your own local branch with git am, and then squish them
> together. Let me know if you have trouble with that.
>
> I think get_seqence_start_value() should be static. (Or otherwise, it should
> be in lsyscache.c).
>
> The include added to execPartition.c seems to be unused.
>
> +#define RELATION_IS_TEMP_ON_CURRENT_SESSION(relation) \
> +#define RELATION_IS_TEMP(relation) \
> +#define RelpersistenceTsTemp(relpersistence) \
> +#define RELATION_GTT_ON_COMMIT_DELETE(relation) \
>
> => These macros can evaluate their arguments multiple times.
> You should add a comment to warn about that. And maybe avoid passing them a
> function argument, like: RelpersistenceTsTemp(get_rel_persistence(rte->relid))
>
> +list_all_backend_gtt_frozenxids should return TransactionId not int.
> The function name should say "oldest" and not "all" ?
>
> I think the GUC should have a longer name. max_active_gtt is too short for a
> global var.
>
> +#define MIN_NUM_ACTIVE_GTT 0
> +#define DEFAULT_NUM_ACTIVE_GTT 1000
> +#define MAX_NUM_ACTIVE_GTT 1000000
>
> +int max_active_gtt = MIN_NUM_ACTIVE_GTT
>
> It's being initialized to MIN, but then the GUC machinery sets it to DEFAULT.
> By convention, it should be initialized to default.
>
> fout->remoteVersion >= 140000
>
> => should say 15
>
> describe.c has gettext_noop("session"), which is a half-truth. The data is
> per-session but the table definition is persistent..
Thanks for your advice, I will try to merge this part of the code.

>
> You redirect stats from pg_class and pg_statistics to a local hash table.
> This is pretty hairy :(
> I guess you'd also need to handle pg_statistic_ext and ext_data.
> pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to look
> at pg_get_gtt_statistics.
>
> I wonder if there's a better way to do it, like updating pg_statistic but
> forcing the changes to be rolled back when the session ends... But I think
> that would make longrunning sessions behave badly, the same as "longrunning
> transactions".

There are three pieces of data related to session-level GTT data that need to be managed
1 session-level storage info like relfilenode
2 session-level like relfrozenxid
3 session-level stats like relpages or column stats

I think the 1 and 2 are necessary, but not for stats.
In the previous email, It has been suggested that GTT statistics not be processed.
This means that GTT statistics are not recorded in the localhash or catalog.
In my observation, very few users require an accurate query plan for temporary tables to
perform manual analyze.
Of course, doing this will also avoid catalog bloat and performance problems.

>
> Have you looked at Gilles Darold's GTT extension ?
If you are referring to https://github.com/darold/pgtt <https://github.com/darold/pgtt> , yes.
It is smart to use unlogged table as a template and then use LTT to read and write data.
For this implementation, I want to point out two things:
1 For the first insert of GTT in each session, create table or create index is implicitly executed.
2 The catalog bloat caused by LTT still exist.

Regards, Wenjing.

> <0002-f-0002-gtt-v64-doc.txt><0004-f-0003-gtt-v64-implementation.txt><0006-f-0004-gtt-v64-regress.txt>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-03-01 07:12:25 Re: Failed transaction statistics to measure the logical replication progress
Previous Message Yugo NAGATA 2022-03-01 06:55:59 Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error