Re: Eliminating SPI / SQL from some RI triggers - take 3

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Evan Montgomery-Recht <montge(at)mianetworks(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: 2026-04-08 14:26:21
Message-ID: CA+HiwqFt4NGTNk7BinOsHHM48E9zGAa852vCfGoSe1bbL=JNFQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 8, 2026 at 6:58 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Apr 8, 2026 at 10:23 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Tue, Apr 7, 2026 at 10:00 PM Evan Montgomery-Recht
> > <montge(at)mianetworks(dot)net> wrote:
> > > The patch also adds a test module (test_spi_func) with a C function
> > > that executes SQL via SPI_connect/SPI_execute/SPI_finish, since this
> > > crash cannot be triggered from PL/pgSQL. The test exercises the
> > > C-level SPI INSERT with multiple FK constraints, FK violations, and
> > > nested PL/pgSQL-calls-C-SPI (matching the PostGIS call pattern).
>
> I applied only the test module changes and it passes (without
> crashing) even without your proposed fix. It seems that's because the
> C function in test_spi_func calling SPI is using the same resource
> owner as the parent SELECT. I think you'd need to create a resource
> owner manually in the spi_exec() C function to reproduce the crash, as
> done in the attached 0001, which contains the src/test changes
> extracted from your patch modified as described, including renaming
> the C function to spi_exec_sql().
>
> Also, the test cases that call spi_exec() (_sql()) directly from a
> SELECT don't actually exercise the crash path because there is no
> outer trigger-firing loop active. query_depth is 0 inside the inner
> SPI's AfterTriggerEndQuery, so the old guard wouldn't suppress the
> callback there anyway. The critical case requires spi_exec_sql() to be
> called from inside an AFTER trigger, where query_depth > 0 causes the
> guard to defer the callback past the inner resource owner's lifetime.
> I've added that test case. I kept your original test cases as they
> still provide useful coverage of C-level SPI FK behavior even if they
> don't exercise the crash path specifically. Maybe your original
> PostGIS test suite that hit the crash did have the right structure,
> but that's not reflected in the patch as far as I can tell.
>
> I've also renamed the module to test_spi_resowner to better reflect
> what it's about.
>
> For the fix, I have a different proposal. As you observed, the
> query_depth > 0 early return in FireAfterTriggerBatchCallbacks() means
> that the nested SPI's callbacks get called under the outer resource
> owner, which may not be the same as the one that SPI used. I think it
> was a mistake to have that early return in the first place. Instead we
> could remember for each callback what firing level it should be called
> at, so the nested SPI's callbacks fire before returning to the parent
> level and parent-level callbacks fire when the parent level completes.
> I have implemented that in the attached 0002 along with transaction
> boundary cleanup of callbacks, which passes the check-world for me,
> but I'll need to stare some more at it before committing.
>
> Let me know if this also fixes your own in-house test suite or if you
> have any other suggestions or if you think I am missing something.

One more cleanup patch attached as 0003: afterTriggerFiringDepth was
added by commit 5c54c3ed1 as a file-static variable, which in
hindsight should have been a field in AfterTriggersData alongside the
other per-transaction after-trigger state. This patch makes that
correction.

One alternative design worth considering for 0002: storing
batch_callbacks per query level in AfterTriggersQueryData rather than
as a single list in AfterTriggersData, so callbacks naturally live at
the query level where they were registered and get cleaned up with
AfterTriggerFreeQuery on abort. Deferred constraints still need a
top-level list in AfterTriggersData since they fire outside any query
level. FireAfterTriggerBatchCallbacks() takes a list parameter and the
caller passes either the query-level or top-level list as appropriate.
This eliminates the need for firing_depth-matched firing entirely. I
did that in 0004. I think I like it over 0002. Will look more
closely tomorrow morning.

--
Thanks, Amit Langote

Attachment Content-Type Size
v3-0004-Store-batch-callbacks-at-the-appropriate-level-ra.patch application/octet-stream 7.8 KB
v3-0002-Fix-RI-fast-path-crash-under-nested-C-level-SPI.patch application/octet-stream 5.6 KB
v3-0003-Move-afterTriggerFiringDepth-into-AfterTriggersDa.patch application/octet-stream 5.6 KB
v3-0001-Modified-test-suite-from-Evan-s-patch.patch application/octet-stream 17.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2026-04-08 14:34:14 Re: feature freeze for v19 begins April 8th at 12:00 UTC
Previous Message Jim Jones 2026-04-08 14:04:35 Re: Fix bug with accessing to temporary tables of other sessions