Quick Links

Re: Eliminating SPI / SQL from some RI triggers - take 3

From:	Amit Langote <amitlangote09(at)gmail(dot)com>
To:	Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc:	Haibo Yan <tristan(dot)yim(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject:	Re: Eliminating SPI / SQL from some RI triggers - take 3
Date:	2026-03-30 04:55:07
Message-ID:	CA+HiwqF_Kz=R8juHJBiOATvabWSOugK4VaGOxoJ_n=E2c7UM9w@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Mar 25, 2026 at 9:41 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Tue, Mar 24, 2026 at 10:56 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Tue, Mar 24, 2026 at 8:47 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > >
> > > Hi Junwang,
> > >
> > > On Fri, Mar 20, 2026 at 1:20 AM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > > > I squashed 0004 into 0003 so that each file can be committed independently.
> > > > I also runned pgindent for each file.
> > >
> > > Thanks for that.
> > >
> > > Here's another version.
> > >
> > > In 0001, I noticed that the condition change in ri_HashCompareOp could
> > > be simplified further. Also improved the commentary surrounding that.
> > > I also updated the commit message to clarify parity with the SPI path.
> > >
> > > Updated the commit message of 0002 to talk about why caching the
> > > snapshot for the entire trigger firing cycle of a given constraint
> > > makes a trade off compared to the SPI path which retakes the snapshot
> > > for every row checked and could in principle avoid failure for FK rows
> > > whose corresponding PK row was added by a concurrently committed
> > > transaction, at least in the READ COMMITTED case.
> > >
> > > Updated the commit message of 0003 to clarify that it replaces
> > > ri_FastPathCheckCached() from 0002 with the BatchAdd/BatchFlush pair,
> > > and that the cached resources are used unchanged -- only the probing
> > > cadence changes from per-row to per-flush. Per-flush CCI is safe
> > > because all AFTER triggers for the buffered rows have already fired
> > > by flush time; a new test case is added to show that.
> >
> > Kept thinking about this on a walk after I sent this and came to the
> > conclusion that it might be better to just not cache the snapshot with
> > only the above argument in its favor. If repeated GetSnapshotData()
> > is expensive, the solution should be to fix that instead of simply
> > side-stepping it.
> >
> > By taking a snapshot per-batch without caching it, and so likewise the
> > IndexScanDesc, I'm seeing the same ~3x speedup in the batched
> > SK_SEARCHARRAY case, so I don't see much point in being very stubborn
> > about snapshot caching. Like in the attached (there's an unrelated
> > memory context switch thinko fix). Note that relations (pk_rel,
> > idx_rel) and the slot remain cached across the batch; only the
> > snapshot and scandesc are taken fresh per flush.
> >
> > I'll post an updated version tomorrow morning. I think it might be
> > better to just merge 0003 into 0002, because without snapshot and
> > scandesc caching the standalone value of 0002 is mostly just relation
> > and slot caching -- the interesting parts (batch callbacks, lifecycle
> > management) are all scaffolding for the batching. So v10 will be two
> > patches: 0001 core fast path, 0002 everything else.
>
> And here's a set like that. I noticed that we don't need a dedicated
> scan_cxt now that scandesc is not cached and a few other
> simplifications.

Junwang pointed out off-list that FK tuples added to
RI_FastPathEntry.batch[] were being copied into TopTransactionContext
rather than flush_cxt, so they would accumulate until the batch was
exhausted rather than being reclaimed per flush. Fixed in
ri_FastPathBatchAdd() in 0002.

Also added a couple of comments in trigger.c that were missing: an
Assert and explanation in RegisterAfterTriggerBatchCallback()
clarifying the query_depth >= 0 precondition, a comment at the
AfterTriggerEndQuery call site explaining why
FireAfterTriggerBatchCallbacks() must precede the query_depth
decrement and AfterTriggerFreeQuery, and brief intent comments at the
AfterTriggerFireDeferred and AfterTriggerSetState call sites.

Plan is to commit 0001 tomorrow barring objections and let it sit for
a bit before committing 0002. Feedback on 0002, particularly on the
AfterTriggerBatchCallback mechanism in trigger.c, welcome in the
meantime.

--
Thanks, Amit Langote

Attachment	Content-Type	Size
v11-0002-Batch-FK-rows-and-use-SK_SEARCHARRAY-for-fast-pa.patch	application/octet-stream	42.6 KB
v11-0001-Add-fast-path-for-foreign-key-constraint-checks.patch	application/octet-stream	30.9 KB

In response to

Re: Eliminating SPI / SQL from some RI triggers - take 3 at 2026-03-25 00:41:11 from Amit Langote

Responses

Re: Eliminating SPI / SQL from some RI triggers - take 3 at 2026-03-30 11:15:05 from Amit Langote

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Lukas Fittl	2026-03-30 05:21:46	Re: Refactor query normalization into core query jumbling
Previous Message	Ashutosh Bapat	2026-03-30 04:50:22	Re: Better shared data structure management and resizable shared data structures