Re: Segfault due to NULL ParamExecData value

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault due to NULL ParamExecData value
Date: 2025-12-05 13:33:26
Message-ID: CAO6_XqrLc0NbO0en=CQ=xibPHugRUsoh9WuTU4LFTD1268E3Tg@mail.gmail.com
Lists: pgsql-bugs

On Thu, Dec 4, 2025 at 4:35 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I'm not volunteering to look into this without a reproducer.
> However, seeing that EvalPlanQual is in the stack trace,
> my gut feeling is that the EPQ mechanism is somehow mis-managing
> output Params for InitPlans. I vaguely recall some definitional
> issues around whether it'd be okay to pass down already-computed
> InitPlan results into the EPQ sub-evaluation, or whether we should
> force the sub-evaluation to do those afresh. It was awhile back
> and I don't remember what was decided.
>

That sounds like an interesting lead. The impacted cluster definitely had a
lot of long transactions updating the same rows, with occasional
deadlocks.

> Don't suppose you can try to reproduce this on something newer
> than 14.17?
>

That would be hard. On the production cluster, we've stopped the segfaults
by rewriting the query (faster execution time, so fewer rechecks and EPQ
runs?), and I don't have much leeway to experiment on it.

I'm currently working on a backup of the cluster, trying to rerun the same
queries on 14.17, since the issue was happening with this version. If I
manage to get a reproducer, I will check newer versions. I will focus on
triggering the EPQ, since that looks like a good lead.

Regards,
Anthonin Bonnefoy
