| From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
|---|---|
| To: | Nico Williams <nico(at)cryptonector(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Global temporary tables |
| Date: | 2026-07-03 23:32:42 |
| Message-ID: | CAEZATCWMYVFcvzPuGDwP1T=53zUs8+E2BGGi3YZYS3nh1thRDQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, 2 Jul 2026 at 00:20, Nico Williams <nico(at)cryptonector(dot)com> wrote:
>
> I drove Claude through a review. You can too, so there's nothing
> special to it, but I'll go over it here, with a bit of my content added,
> and hopefully y'all don't hate me for sharing LLM outputs.
Wow, we really will be out of a job in a few years :-)
Thanks for doing that!
It managed to find a lot that I'd overlooked. I'm attaching v6, with
fixes for most of those issues plus a couple more.
> 1. You have an uninitialized have_gtrs in
> src/backend/commands/vacuum.c:504.
Fixed in v6
> 2. TRUNCATE on a GTT can cause corruption.
>
> Claude's description (I don't know how to evaluate this one myself):
>
> RelationSetNewRelfilenumber writes session-local freeze xids into
> shared pg_class for GTTs (relcache.c:4057-4076). Only relfilenode is
> split via SetEffective_relfilenode. relpages=0, reltuples=-1,
> relallvisible=0, relallfrozen=0, relfrozenxid=freezeXid,
> relminmxid=minmulti, relpersistence are written directly into
> classform and committed via CatalogTupleUpdate(pg_class, ...).
Yes, it needed to use the SetEffective_* methods for all those.
> Fix: the post-SetEffective_relfilenode block must update
> temp_classform instead of classform when temp_tuple is valid (or skip
> stats/freeze fields on classform entirely for GTT).
v6 adds new SetEffective_relfrozenxid/relminmxid() methods and uses those.
> 3. vac_get_min_tempfrozenxids() scans ProcGlobal->allProcCount without
> ProcArrayLock (vacuum.c:1769-1799). Every comparable consumer
> (vac_update_datfrozenxid, GetSnapshotData) acquires it.
Fixed in v6.
> 4. eoxact_usage_list_len / eoxact_usage_list_overflowed never reset at
> end-of-xact (global_temp.c:1176-1179). Only the storage list counters
> reset. Once a long-lived session overflows (>32 GTTs touched
> cumulatively), every subsequent xact falls back to full hash_seq
> scans. Correctness preserved, performance silently degrades.
Fixed in v6
> 5. MyProc->tempfrozenxid never decays. Set on first GTT access; only
> moves forward on a session's VACUUM of its own GTT. A backend that
> touches a GTT once and never vacuums holds back pg_class.relfrozenxid
> advancement for all sessions for the connection's lifetime. No path
> clears it on GlobalTempRelationDropped or when local usage_count
> drops to zero.
>
> I explored this one a bit because I was surprised that the frozenxid
> in the catalog should be relevant to GTTs. If I understand correctly
> this tempfrozenxid exists because while the GTT storage is
> per-session, there is some coupling via the CLOG, but I don't
> understand why _that_ has to be, and this is what Claude told me:
>
> An alternative that would have avoided the coordination
>
> You could have designed it so that GTT pg_class rows are skipped
> from datfrozenxid computation (just like local temp), and each
> backend's tempfrozenxid fed directly into vac_update_datfrozenxid
> as an additional constraint. That would let the shared
> pg_class.relfrozenxid be advisory/stale without a safety issue,
> removing one leg of the coordination.
>
> The current design chose to keep the shared row accurate — probably
> because it's user-visible (SELECT relfrozenxid FROM pg_class) and
> drives per-relation autovacuum urgency signals. That's the
> tradeoff: an accurate shared row, in exchange for needing
> cross-session horizon exchange. Same fundamental need as local
> temp; different plumbing because the pg_class row is shared.
>
> And, ah, well, I guess you could say that `relfrozenxid` in
> `pg_class` is not meaningful for GTTs, or that you'll get your
> session's GTT's relfrozenxid.
Yes, that's a very accurate description of the design choice I made.
Having thought more about it, I think the other choice is better,
because it's simpler, and it reduces churn on pg_class.
So v6 now sets pg_class.relfrozenxid and pg_class.relminmxid to 0 for
all global temporary relations, and only the pg_temp_class values are
ever updated. That seems more logical, because a GTT has no storage
until it is used by some session, so it makes sense for the frozen
XIDs in pg_class to be 0.
Also, DROP and TRUNCATE now cause tempfrozenxid and tempminmxid to be
updated. There was another bug there -- things like DROP, REPACK, and
TRUNCATE that advance tempfrozenxid and tempminmxid should only do so
if the transaction commits. Each of those (but not VACUUM) can be
rolled back, in which case tempfrozenxid and tempminmxid should not be
advanced.
> 6. InsertPgTempIndexTuple mutates rel->rd_index->indisvalid in-place
> (global_temp.c:885-892). Relcache fields are normally immutable
> outside relcache.c. Works today but fragile.
Actually, it's InitGlobalTempRelation() that does this, not
InsertPgTempIndexTuple(). InitGlobalTempRelation() is only called from
one function in relcache.c, as the final part of setting up the
relcache entry. In an early prototype, I had InitGlobalTempRelation()
in relcache.c, and I guess it could be moved back, but I don't think
it's particularly bad the way it is.
> 7. AbortTransaction ordering — gtrs_dropped is list_freed
> unconditionally in AtEOXact (:1178) even though the in-source comment
> (:1077-1080) claims it'll be "repeated next commit". Mostly benign
> (storage is reaped at exit) but the comment is wrong.
Yeah, when I originally started writing that, I intended that it would
be repeated on the next commit, but in the end, I decided maybe that
wouldn't be a good idea, because if it failed for some reason, it
might keep failing, making every subsequent transaction fail. So I've
just updated that comment, and left that code as it was.
> 8. DeleteRelationTuple calls GlobalTempRelationDropped after
> CatalogTupleDelete; GlobalTempRelationDropped then calls
> get_rel_relkind(relid) (global_temp.c:1038). After a CCI, the
> syscache will return '\0' and skip DeletePgTempIndexTuple. Pass
> relkind explicitly instead of re-deriving it.
This was a nice catch. I fixed it in v6 by saving the relkind on the
usage entry, so it doesn't need to look it up again.
In addition to those, I also fixed a couple of things in ON COMMIT
DELETE ROWS processing:
1. PreCommit_GlobalTempRelation() needs to be called *before* ON
COMMIT handling, in case another session drops a GTT with ON COMMIT
DELETE ROWS -- we need to remove it from the on_commits list before ON
COMMIT processing happens.
2. The ON COMMIT DELETE ROWS processing now does its work with a
RowExclusiveLock for GTTs, preventing blocking/deadlock risks.
I'll try and post another update soon, addressing other people's
review comments.
Regards,
Dean
| Attachment | Content-Type | Size |
|---|---|---|
| v6-0001-Save-temporary-table-ON-COMMIT-actions-to-pg_clas.patch | text/x-patch | 12.8 KB |
| v6-0002-Basic-support-for-global-temporary-tables.patch | text/x-patch | 139.1 KB |
| v6-0003-Support-indexes-on-global-temporary-tables.patch | text/x-patch | 36.2 KB |
| v6-0004-Support-global-temporary-sequences.patch | text/x-patch | 17.6 KB |
| v6-0005-Support-global-temporary-catalog-tables-and-add-p.patch | text/x-patch | 86.0 KB |
| v6-0006-Add-relation-statistics-columns-to-pg_temp_class.patch | text/x-patch | 52.8 KB |
| v6-0007-Add-relfrozenxid-and-relminmxid-columns-to-pg_tem.patch | text/x-patch | 42.3 KB |
| v6-0008-Add-pg_temp_statistic-global-temporary-catalog-ta.patch | text/x-patch | 32.2 KB |
| v6-0009-Add-pg_temp_statistic_ext_data-global-temporary-c.patch | text/x-patch | 36.7 KB |
| v6-0010-Add-pg_temp_index-global-temporary-catalog-table.patch | text/x-patch | 61.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Dean Rasheed | 2026-07-03 23:37:12 | Re: Global temporary tables |
| Previous Message | Alexander Korotkov | 2026-07-03 23:05:32 | Re: Asynchronous MergeAppend |