| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> |
| Cc: | adoros(at)starfishstorage(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19480: PL/Python SRF crashes (SIGSEGV) when function is replaced mid-iteration: use-after-free in PLy_funct |
| Date: | 2026-05-28 15:12:26 |
| Message-ID: | 982975.1779981146@sss.pgh.pa.us |
| Lists: | pgsql-bugs |
"Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> writes:
> On Fri May 15, 2026 at 8:11 AM -03, PG Bug reporting form wrote:
>> The root cause is that srfstate->savedargs is tied to proc->mcxt (which can
>> be deleted at any per-call boundary) rather than to
>> funcctx->multi_call_memory_ctx (which lives for the entire SRF lifetime).
> Option A seems to fix the issue (see attached patch), but I've found
> another issue while playing with this that I think is related:
> ...
> This is because when PLy_procedure_delete() is executed from
> PLy_procedure_get(), it also destroys information related to recursive
> functions, such as "calldepth", "argstack" and "globals". That causes
> the assertion failure Assert(proc->calldepth > 0) in PLy_global_args_pop()
> when it is executed in the PG_CATCH block of PLy_exec_function(), or
> EXC_BAD_ACCESS when accessing "argstack" or "globals".
Yeah. The bigger picture though is: if we are re-entrantly calling
either a recursive function or a SRF, we should not destroy any of the
existing state, nor do we want to replace the function body. The only
way to have sane behavior is to keep executing the same function body
until the execution instance (recursion level or continued SRF) is
done. So these concerns about associated state are only part of the
problem.
plpgsql ran into this years ago, and its solution has been to maintain
a reference count on each function parsetree and not destroy an
obsoleted parsetree till the reference count goes to zero. I've had
in the back of my head that the other PLs need to do likewise, but it
hasn't gotten to the front of the to-do list, mainly because the other
PLs are much less used and so field complaints about this have been
rare. I had hoped also that the language interpreters underlying the
other PLs might solve some of this for us, but it's unclear to what
extent they help. Certainly it's not cool to be clobbering our own
execution state that's outside the language interpreter.
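To illustrate the reference-counting idea described above, here is a minimal standalone sketch. It is not plpgsql's actual code: CachedProc, proc_acquire(), proc_release(), proc_replace() and the live_procs counter are hypothetical names invented for the example, and plain malloc/free stands in for memory contexts. The point is only that an obsoleted entry is marked rather than freed while any execution instance (a recursion level or an unfinished SRF) still holds a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PL's cached, compiled function. */
typedef struct CachedProc
{
    const char *body;       /* compiled function body (illustrative only) */
    unsigned    use_count;  /* active executions: recursion levels, open SRFs */
    bool        obsolete;   /* true once a newer definition has replaced it */
} CachedProc;

/* Number of entries currently allocated; exists only to make the demo checkable. */
static int live_procs = 0;

static CachedProc *
proc_create(const char *body)
{
    CachedProc *proc = malloc(sizeof(CachedProc));

    if (proc == NULL)
        abort();
    proc->body = body;
    proc->use_count = 0;
    proc->obsolete = false;
    live_procs++;
    return proc;
}

static void
proc_free(CachedProc *proc)
{
    free(proc);
    live_procs--;
}

/* Pin the entry at the start of an execution instance. */
static void
proc_acquire(CachedProc *proc)
{
    proc->use_count++;
}

/* Unpin at the end; the last user of an obsolete entry destroys it. */
static void
proc_release(CachedProc *proc)
{
    assert(proc->use_count > 0);
    if (--proc->use_count == 0 && proc->obsolete)
        proc_free(proc);
}

/*
 * Called when a lookup finds that the function definition has changed.
 * The old entry is freed immediately only if nothing is executing it;
 * otherwise it is merely marked obsolete and lives on until released.
 */
static CachedProc *
proc_replace(CachedProc *old, const char *new_body)
{
    old->obsolete = true;
    if (old->use_count == 0)
        proc_free(old);
    return proc_create(new_body);
}
```

In this sketch, a caller that does proc_acquire() before the first SRF call and proc_release() after the last one keeps running the old body even if proc_replace() happens in between. New callers get the entry returned by proc_replace().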
We might want to go as far as converting the other PLs to use the
utils/cache/funccache.c infrastructure, but perhaps there is a
less invasive fix. Certainly, a fix based on funccache.c could not
be back-patched. (On the other hand, given the rarity of complaints,
perhaps a HEAD-only fix is acceptable.)
regards, tom lane