Re: Eliminating SPI / SQL from some RI triggers - take 3

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Evan Montgomery-Recht <montge(at)mianetworks(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: 2026-04-09 07:39:44
Message-ID: 8561287B-19F1-421E-959D-99F4593CFF54@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Apr 8, 2026, at 22:26, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 6:58 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Wed, Apr 8, 2026 at 10:23 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>>> On Tue, Apr 7, 2026 at 10:00 PM Evan Montgomery-Recht
>>> <montge(at)mianetworks(dot)net> wrote:
>>>> The patch also adds a test module (test_spi_func) with a C function
>>>> that executes SQL via SPI_connect/SPI_execute/SPI_finish, since this
>>>> crash cannot be triggered from PL/pgSQL. The test exercises the
>>>> C-level SPI INSERT with multiple FK constraints, FK violations, and
>>>> nested PL/pgSQL-calls-C-SPI (matching the PostGIS call pattern).
>>
>> I applied only the test module changes and it passes (without
>> crashing) even without your proposed fix. It seems that's because the
>> C function in test_spi_func calling SPI is using the same resource
>> owner as the parent SELECT. I think you'd need to create a resource
>> owner manually in the spi_exec() C function to reproduce the crash, as
>> done in the attached 0001, which contains the src/test changes
>> extracted from your patch modified as described, including renaming
>> the C function to spi_exec_sql().
>>
>> Also, the test cases that call spi_exec() (_sql()) directly from a
>> SELECT don't actually exercise the crash path because there is no
>> outer trigger-firing loop active. query_depth is 0 inside the inner
>> SPI's AfterTriggerEndQuery, so the old guard wouldn't suppress the
>> callback there anyway. The critical case requires spi_exec_sql() to be
>> called from inside an AFTER trigger, where query_depth > 0 causes the
>> guard to defer the callback past the inner resource owner's lifetime.
>> I've added that test case. I kept your original test cases as they
>> still provide useful coverage of C-level SPI FK behavior even if they
>> don't exercise the crash path specifically. Maybe your original
>> PostGIS test suite that hit the crash did have the right structure,
>> but that's not reflected in the patch as far as I can tell.
>>
>> I've also renamed the module to test_spi_resowner to better reflect
>> what it's about.
>>
>> For the fix, I have a different proposal. As you observed, the
>> query_depth > 0 early return in FireAfterTriggerBatchCallbacks() means
>> that the nested SPI's callbacks get called under the outer resource
>> owner, which may not be the same as the one that SPI used. I think it
>> was a mistake to have that early return in the first place. Instead we
>> could remember for each callback what firing level it should be called
>> at, so the nested SPI's callbacks fire before returning to the parent
>> level and parent-level callbacks fire when the parent level completes.
>> I have implemented that in the attached 0002 along with transaction
>> boundary cleanup of callbacks, which passes the check-world for me,
>> but I'll need to stare some more at it before committing.
>>
>> Let me know if this also fixes your own in-house test suite or if you
>> have any other suggestions or if you think I am missing something.
>
> One more cleanup patch attached as 0003: afterTriggerFiringDepth was
> added by commit 5c54c3ed1 as a file-static variable, which in
> hindsight should have been a field in AfterTriggersData alongside the
> other per-transaction after-trigger state. This patch makes that
> correction.
>
> One alternative design worth considering for 0002: storing
> batch_callbacks per query level in AfterTriggersQueryData rather than
> as a single list in AfterTriggersData, so callbacks naturally live at
> the query level where they were registered and get cleaned up with
> AfterTriggerFreeQuery on abort. Deferred constraints still need a
> top-level list in AfterTriggersData since they fire outside any query
> level. FireAfterTriggerBatchCallbacks() takes a list parameter and the
> caller passes either the query-level or top-level list as appropriate.
> This eliminates the need for firing_depth-matched firing entirely. I
> did that in 0004. I think I like it over 0002. Will look more
> closely tomorrow morning.
>
>
> --
> Thanks, Amit Langote
> <v3-0004-Store-batch-callbacks-at-the-appropriate-level-ra.patch><v3-0002-Fix-RI-fast-path-crash-under-nested-C-level-SPI.patch><v3-0003-Move-afterTriggerFiringDepth-into-AfterTriggersDa.patch><v3-0001-Modified-test-suite-from-Evan-s-patch.patch>

A few comments on v3:

1 - 0002
```
static void
FireAfterTriggerBatchCallbacks(void)
{
+ List *remaining = NIL;
+ List *to_fire = NIL;
ListCell *lc;

- if (afterTriggers.query_depth > 0)
- return;
+ /* remaining and to_fire lists must survive until callbacks complete */
+ MemoryContext oldcxt = MemoryContextSwitchTo(TopTransactionContext);
```

I think remaining and to_fire should stay in the same context of afterTriggers.batch_callbacks, so instead of hard coding TopTransactionContext, we can use GetMemoryChunkContext(afterTriggers.batch_callbacks), which makes the intention explicit.

2 - 0004, I noticed one potential problem, although I am not sure whether it can really happen in practice. This version stores callback items at the individual query depth, and FireAfterTriggerBatchCallbacks() now iterates the callback list for that depth and invokes each callback directly. My concern is that if one of those callbacks needs to register a new callback, that would append a new item to the same list while it is being iterated. That seems unsafe to me, because list append may create a new list structure underneath. If that happens, we may end up modifying the list being traversed, which does not look safe.

This problem doesn’t exist in 0002, because 0002 splits afterTriggers.batch_callbacks into remaining and to_fire, and reset afterTriggers.batch_callbacks = remaining before running callbacks. But the problem is, if a callback registers a new callback, the new callback goes to afterTriggers.batch_callbacks, so it won’t get executed.

From this perspective, I would assume a callback should not be allowed to register a new callback. Can you please help confirm?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dapeng Wang 2026-04-09 07:59:46 Re: create table like including storage parameter
Previous Message Xuneng Zhou 2026-04-09 07:33:08 Re: Optimize SnapBuild by maintaining committed.xip in sorted order