| From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
|---|---|
| To: | Evan Montgomery-Recht <montge(at)mianetworks(dot)net> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Eliminating SPI / SQL from some RI triggers - take 3 |
| Date: | 2026-04-08 14:26:21 |
| Message-ID: | CA+HiwqFt4NGTNk7BinOsHHM48E9zGAa852vCfGoSe1bbL=JNFQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Apr 8, 2026 at 6:58 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Wed, Apr 8, 2026 at 10:23 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Tue, Apr 7, 2026 at 10:00 PM Evan Montgomery-Recht
> > <montge(at)mianetworks(dot)net> wrote:
> > > The patch also adds a test module (test_spi_func) with a C function
> > > that executes SQL via SPI_connect/SPI_execute/SPI_finish, since this
> > > crash cannot be triggered from PL/pgSQL. The test exercises the
> > > C-level SPI INSERT with multiple FK constraints, FK violations, and
> > > nested PL/pgSQL-calls-C-SPI (matching the PostGIS call pattern).
>
> I applied only the test module changes and it passes (without
> crashing) even without your proposed fix. It seems that's because the
> C function in test_spi_func calling SPI is using the same resource
> owner as the parent SELECT. I think you'd need to create a resource
> owner manually in the spi_exec() C function to reproduce the crash, as
> done in the attached 0001, which contains the src/test changes
> extracted from your patch modified as described, including renaming
> the C function to spi_exec_sql().
>
> Also, the test cases that call spi_exec() (_sql()) directly from a
> SELECT don't actually exercise the crash path because there is no
> outer trigger-firing loop active. query_depth is 0 inside the inner
> SPI's AfterTriggerEndQuery, so the old guard wouldn't suppress the
> callback there anyway. The critical case requires spi_exec_sql() to be
> called from inside an AFTER trigger, where query_depth > 0 causes the
> guard to defer the callback past the inner resource owner's lifetime.
> I've added that test case. I kept your original test cases as they
> still provide useful coverage of C-level SPI FK behavior even if they
> don't exercise the crash path specifically. Maybe your original
> PostGIS test suite that hit the crash did have the right structure,
> but that's not reflected in the patch as far as I can tell.
>
> I've also renamed the module to test_spi_resowner to better reflect
> what it's about.
>
> For the fix, I have a different proposal. As you observed, the
> query_depth > 0 early return in FireAfterTriggerBatchCallbacks() means
> that the nested SPI's callbacks get called under the outer resource
> owner, which may not be the same as the one that SPI used. I think it
> was a mistake to have that early return in the first place. Instead we
> could remember for each callback what firing level it should be called
> at, so the nested SPI's callbacks fire before returning to the parent
> level and parent-level callbacks fire when the parent level completes.
> I have implemented that in the attached 0002 along with transaction
> boundary cleanup of callbacks, which passes the check-world for me,
> but I'll need to stare some more at it before committing.
>
> Let me know if this also fixes your own in-house test suite or if you
> have any other suggestions or if you think I am missing something.
One more cleanup patch attached as 0003: afterTriggerFiringDepth was
added by commit 5c54c3ed1 as a file-static variable, which in
hindsight should have been a field in AfterTriggersData alongside the
other per-transaction after-trigger state. This patch makes that
correction.
One alternative design worth considering for 0002: storing
batch_callbacks per query level in AfterTriggersQueryData rather than
as a single list in AfterTriggersData, so callbacks naturally live at
the query level where they were registered and get cleaned up with
AfterTriggerFreeQuery on abort. Deferred constraints still need a
top-level list in AfterTriggersData since they fire outside any query
level. FireAfterTriggerBatchCallbacks() takes a list parameter and the
caller passes either the query-level or top-level list as appropriate.
This eliminates the need for firing_depth-matched firing entirely. I
did that in 0004. I think I like it over 0002. Will look more
closely tomorrow morning.
--
Thanks, Amit Langote
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0004-Store-batch-callbacks-at-the-appropriate-level-ra.patch | application/octet-stream | 7.8 KB |
| v3-0002-Fix-RI-fast-path-crash-under-nested-C-level-SPI.patch | application/octet-stream | 5.6 KB |
| v3-0003-Move-afterTriggerFiringDepth-into-AfterTriggersDa.patch | application/octet-stream | 5.6 KB |
| v3-0001-Modified-test-suite-from-Evan-s-patch.patch | application/octet-stream | 17.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2026-04-08 14:34:14 | Re: feature freeze for v19 begins April 8th at 12:00 UTC |
| Previous Message | Jim Jones | 2026-04-08 14:04:35 | Re: Fix bug with accessing to temporary tables of other sessions |