Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Dmitry Fomin <fomin(dot)list(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)
Date: 2025-10-20 13:48:40
Message-ID: CA+HiwqH8p+6f+W_tk5am6zaCpmxX5mSRmjA_egt7N4yu2yMT0Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, Oct 17, 2025 at 7:36 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > > Thanks for the detailed report. It seems to have been caused by
> > > ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> > > isn't set for this partition which causes ExecGetChildToRootMap() to
> > > think no translation is required.
> >
> > The problem is with the ResultRelInfo caching that's in
> > ExecGetTriggerResultRel(). The code there tries looking into the
> > estate's es_opened_result_relations, es_tuple_routing_result_relations
> > and es_trig_target_relations Lists to see if the ResultRelInfo was
> > created before. In the problem case the ResultRelInfo is found in
> > es_trig_target_relations and unfortunately it's been set up by some
> > code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
> > This means when ExecGetTriggerResultRel() is called again this time
> > passing the correct rootRelInfo, the cached one that has the NULL
> > ri_RootResultRelInfo is found and returned.
> >
> > This results in the ExecGetChildToRootMap() code doing the wrong thing
> > because it sees a NULL ri_RootResultRelInfo therefore does not
> > translate the slot into the slow format of the partitioned table.
> >
> > I've attached a patch which fixes the problem. I'm just not sure if
> > it's the right fix for the problem. I suspect the real problem is down
> > to the fact that ExecGetTriggerResultRel() passes a NULL rootRelInfo
> > in the first place. I just don't see a good way to figure out what the
> > parent table should be so we know to create a parent ResultRelInfo as
> > the trigger that is firing is for the partition, not the partitioned
> > table. I don't see any way to figure out that the trigger is being
> > fired because it's cascading an update of its parent partitioned
> > table... At a guess, it feels like there might be some fields missing
> > in AfterTriggerShared to figure this out.
> >
> > Any thoughts on this Amit?
>
> Thanks for the off-list heads-up, David.
>
> I'll need to play around with Dmitry's test case a bit before I can be
> sure, but I agree with your suspicion -- something's not right if a
> result rel without a root rel set ends up in a path where tuple
> conversion matters.

What seems to be happening is that trigger firings of the FK child
tables run under the parent table’s UPDATE EState, because the child
table’s trigger events are queued under SPI in ri_triggers.c. That
code doesn’t execute the child’s triggers directly -- it defers them
to run later under the parent’s context. As a result, the EState used
at execution time lacks child ResultRelInfos with a root rel set, and
new ones end up being created and added to es_trig_target_relations.

In this case, the first event that creates a rootless entry in
es_trig_result_relations is the INSERT side of the cross-partition
update on stages, fired for its FK into pipelines. The later UPDATE
event is for builds, whose FK references stages. When that UPDATE
runs, ExecGetTriggerResultRel() looks up the destination result rel
for stages and finds the rootless one that was built for the earlier
INSERT. Because its ri_RootResultRelInfo is NULL,
ExecGetChildToRootMap() skips tuple translation and crashes. That
seems to explain the “something off” we both suspected.

For the fix, I tried a slightly different approach from your patch.
Instead of updating the cached entry to always set
ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
a rooted one is expected. Concretely, I added assertions on the two
primary lists and tightened the es_trig_target_relations lookup to
only return an entry when ri_RootResultRelInfo == rootRelInfo or the
caller’s rootRelInfo is NULL. That prevents the rootless destination
built for the INSERT on stages -> pipelines from being reused for the
later UPDATE from builds -> stages.

This seems a bit safer since it keeps cached entries immutable and
surfaces mismatches via asserts rather than mutating shared state.

Thoughts?

--
Thanks, Amit Langote

Attachment Content-Type Size
v2-0001-Fix-incorrect-reuse-of-ResultRelInfo-in-trigger-e.patch application/octet-stream 10.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Marco Boeringa 2025-10-20 14:42:32 Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Previous Message Marco Boeringa 2025-10-20 11:22:45 Re: Potential "AIO / io workers" inter-worker locking issue in PG18?