Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com>
Cc: adoros(at)starfishstorage(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct
Date: 2026-06-01 23:26:51
Message-ID: 2868592.1780356411@sss.pgh.pa.us
Lists: pgsql-bugs

"Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> writes:
> On Thu May 28, 2026 at 12:12 PM -03, Tom Lane wrote:
>> Yeah. The bigger picture though is: if we are re-entrantly calling
>> either a recursive function or a SRF, we should not destroy any of the
>> existing state, nor do we want to replace the function body. The only
>> way to have sane behavior is to keep executing the same function body
>> until the execution instance (recursion level or continued SRF) is
>> done. So these concerns about associated state are only part of the
>> problem.

> I've been exploring the funccache.c approach for plpython. The main
> challenge is that plpython uses SFRM_ValuePerCall for SRFs, whereas
> plpgsql uses SFRM_Materialize. This means plpgsql can simply increment
> use_count at the start of plpgsql_call_handler() and decrement it at the
> end, since all results are produced in a single call. For plpython,
> ExecMakeTableFunctionResult() calls the handler multiple times, with
> use_count returning to zero between calls.

Right. I think what we have to do is maintain the increased use_count
across the whole series of SRF executions and decrement it only once
we're done. That implies that we need some out-of-band mechanism for
decrementing the use_count if the query fails to run the SRF to
completion for whatever reason (error, LIMIT, etc). The first tool
I would reach for is a context reset callback attached to the query's
executor context, but there may be a better answer. Whether we do it
like that or some other way, it might be appropriate to put
infrastructure for it into funccache.c instead of expecting every PL
that wants to use SFRM_ValuePerCall to re-invent this wheel.
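
As a rough illustration of that idea (a toy model, not PostgreSQL code):
the sketch below holds use_count up from the first call of a
value-per-call series until either the series finishes or the owning
context is reset, whichever comes first. Every Toy*/toy_* name is
hypothetical and invented for this example; in the backend the
corresponding facility would presumably be a MemoryContextCallback
registered with MemoryContextRegisterResetCallback(), and the counter
would live in the funccache entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached function; only the refcount matters. */
typedef struct ToyCachedFunction
{
    int         use_count;  /* > 0 means "in use, do not free or replace" */
} ToyCachedFunction;

/* Hypothetical stand-in for a memory-context reset callback. */
typedef struct ToyResetCallback
{
    void        (*func) (void *arg);
    void       *arg;
    struct ToyResetCallback *next;
} ToyResetCallback;

/* Hypothetical stand-in for the query's executor context. */
typedef struct ToyContext
{
    ToyResetCallback *callbacks;
} ToyContext;

/* State for one execution instance of a value-per-call SRF. */
typedef struct ToySrfState
{
    ToyCachedFunction *fn;
    int         released;   /* guards against decrementing twice */
    ToyResetCallback cb;
} ToySrfState;

/* Drop the reference exactly once, whichever path gets here first. */
void
toy_release(void *arg)
{
    ToySrfState *st = arg;

    if (!st->released)
    {
        assert(st->fn->use_count > 0);
        st->fn->use_count--;
        st->released = 1;
    }
}

/* First call of the series: take the reference, register the cleanup. */
void
toy_srf_begin(ToyContext *cxt, ToySrfState *st, ToyCachedFunction *fn)
{
    st->fn = fn;
    st->released = 0;
    fn->use_count++;

    st->cb.func = toy_release;
    st->cb.arg = st;
    st->cb.next = cxt->callbacks;
    cxt->callbacks = &st->cb;
}

/* Normal completion: the SRF ran to its end. */
void
toy_srf_end(ToySrfState *st)
{
    toy_release(st);
}

/* Context reset: covers error, LIMIT, or any other early exit. */
void
toy_context_reset(ToyContext *cxt)
{
    while (cxt->callbacks != NULL)
    {
        ToyResetCallback *cb = cxt->callbacks;

        cxt->callbacks = cb->next;
        cb->func(cb->arg);
    }
}
```

If the series is abandoned after toy_srf_begin(), toy_context_reset()
alone brings use_count back to zero; if toy_srf_end() ran first, the
later reset is a no-op because of the released flag.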

> I'm still not sure how to proceed here, but it seems like we would need
> some refactoring in plpython to make it work with funccache.

plpython will certainly need some work, but I'm entirely amenable to
also changing funccache if it doesn't support this requirement well.
That module is new as of v18, so it doesn't have much claim to have
a stabilized API yet.

> I've also tried to fix this without funccache, but it seems like we
> would end up implementing something similar anyway.

Yeah, that was my suspicion as well. funccache.c exists because
I realized that SQL-language functions (executor/functions.c) were
going to need logic that plpgsql had had for years.

Actually ... if memory serves, SQL-language functions use ValuePerCall
mode, so there probably already is a solution to this embedded in
functions.c. Did you look at that?

regards, tom lane
