Re: Eliminating SPI / SQL from some RI triggers - take 3

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Haibo Yan <tristan(dot)yim(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: 2026-04-02 07:41:30
Message-ID: CA+HiwqF2Ma6R_QyjfPBvFreYbezFpGcwASJE0a2bM+Y0jvof+g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 1, 2026 at 9:18 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Apr 1, 2026 at 8:56 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > On Wed, Apr 1, 2026 at 5:51 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Apr 1, 2026 at 5:51 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > > On Wed, Apr 1, 2026 at 12:54 AM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > > > > + if (riinfo->fpmeta == NULL)
> > > > > + {
> > > > > + /* Reload to ensure it's valid. */
> > > > > + riinfo = ri_LoadConstraintInfo(riinfo->constraint_id);
> > > > >
> > > > > I was thinking of wrapping the reload in a conditional check like
> > > > > `!riinfo->valid`, since `riinfo` can be valid even when `fpmeta == NULL`.
> > > > > However, `if (riinfo->fpmeta == NULL)` should rarely be true, so the
> > > > > unconditional reload is harmless, and the code is cleaner.
> > > > >
> > > > > +1 to the fix.
> > > >
> > > > Thanks for checking.
> > > >
> > > > I have just pushed a slightly modified version of that.
> > > >
> > > > > > 0002 is the rebased batching patch.
> > > > >
> > > > > The change of RI_FastPathEntry from storing riinfo to fk_relid
> > > > > makes sense to me. I'll do another review on 0002 tomorrow.
> > > >
> > > > Here's another version.
> > > >
> > > > This time, I have another fixup patch (0001) to make FastPathMeta
> > > > self-contained by copying the FmgrInfo structs it needs out of
> > > > RI_CompareHashEntry rather than storing pointers into it. This avoids
> > > > any dependency on those cache entries remaining stable. I'll push
> > > > that once the just committed patch has seen enough BF animals.
> > >
> > > Pushed.
> > >
> > > > 0002 is rebased over that.
> > >
> > > Rebased again.
> >
> > +static void
> > +ri_FastPathBatchFlush(RI_FastPathEntry *fpentry, Relation fk_rel)
> > +{
> > + /* Reload; may have been invalidated since last batch accumulation. */
> > + const RI_ConstraintInfo *riinfo = ri_LoadConstraintInfo(fpentry->conoid);
> >
> > ...
> > + if (riinfo->fpmeta == NULL)
> > + {
> > + /* Reload to ensure it's valid. */
> > + riinfo = ri_LoadConstraintInfo(riinfo->constraint_id);
> > + ri_populate_fastpath_metadata((RI_ConstraintInfo *) riinfo,
> > + fk_rel, idx_rel);
> > + }
> >
> > ri_LoadConstraintInfo is currently invoked twice within
> > ri_FastPathBatchFlush. Should we eliminate the second call?
>
> I think we can't because the entry may be stale by the time we get to
> the ri_populate_fastpath_metadata() call due to intervening steps;
> even something as benign-looking index_beginscan() may call code paths
> that can trigger invalidation in rare cases. Maybe predictably in
> CLOBBER_CACHE_ALWAYS builds.
>
> > Alternatively, we could refactor ri_FastPathBatchFlush to accept
> > an additional parameter, `const RI_ConstraintInfo *riinfo`, so we
> > can remove the need for the first call. In that case, we need to call
> > ri_LoadConstraintInfo in ri_FastPathEndBatch.
>
> Yeah, I think that's fine. Done that way in the attached.
>
> Also, I realized that we could do:
>
> @@ -2937,7 +2937,7 @@ ri_FastPathBatchFlush(RI_FastPathEntry *fpentry,
> Relation fk_rel)
> fk_rel, idx_rel);
> }
> Assert(riinfo->fpmeta);
> - if (riinfo->nkeys == 1)
> + if (riinfo->nkeys == 1 && fpentry->batch_count > 1)
> violation_index = ri_FastPathFlushArray(fpentry, fk_slot, riinfo,
> fk_rel, snapshot, scandesc);
>
> so that the fixed overhead of ri_FastPathFlushArray (allocating
> matched[] array on stack and constructing ArrayType, etc.) is not paid
> unnecessarily for single-row batches.

There's another case in which it is not ok to use FlushArray and that
is if the index AM's amsearcharray is false (should be true in all
cases because the unique index used for PK is always btree). Added an
Assert to that effect next to where SK_SEARCHARRAY is set in
ri_FastPathFlushArray rather than a runtime check in the dispatch
condition.

Patch updated. Also added a comment about invalidation requirement or
lack thereof for RI_FastPathEntry, rename AfterTriggerBatchIsActive()
to simply AfterTriggerIsActive(), fixed the comments in trigger.h
describing the callback mechanism.

Will push tomorrow morning (Friday) barring objections.

--
Thanks, Amit Langote

Attachment Content-Type Size
v17-0001-Batch-FK-rows-and-use-SK_SEARCHARRAY-for-fast-pa.patch application/octet-stream 44.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jie wang 2026-04-02 07:54:18 Re: Small and unlikely overflow hazard in bms_next_member()
Previous Message Chao Li 2026-04-02 07:38:48 Re: PoC: Add condition variable support to WaitEventSetWait()