Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: zlh21343(at)163(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Sami Imseih <samimseih(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =
Date: 2026-06-17 08:19:23
Message-ID: CAJTYsWWJEchycUwkt6=Hk3j3ROOJKVp2n5yaU7JNqwqOmyoEMA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On Wed, 17 Jun 2026 at 09:45, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Tue, Jun 16, 2026 at 08:24:51AM +0900, Michael Paquier wrote:
> > On Mon, Jun 15, 2026 at 02:44:06PM +0530, Ayush Tiwari wrote:
> >> I've added Andres and Michael on the thread, since they have worked on
> >> this in the past, for their input.
> >
> > Thanks for the poke. I have marked this thread as something to look
> > at, but was not able to get back to it. Will investigate..
>
> As far as I can see, pgss is not really a requirement. Your case is
> taking advantage of the module introducing more slowness to enlarge
> the reproduction window. Now saying that pgss being slow is a good
> thing, it's bad, but it helps here. I've tried to reproduce in three
> environments, only my mac is able to get something, because it's
> slower I guess..
>

Yeah, you are right, pgss is not a requirement, it just
makes the delay broader.

> Attached is a script able to reproduce the issue in bash, courtesy of
> Claude because java and I sum up to a value very close to 0, see
> test_bug19520.txt. The trick of the script is the same as your
> scenario, with two concurrent workloads:
> - One with DROP PROC/CREATE PROC/CALL.
> - One with CALL
>
> I had much more success after adding two sleeps to enlarge the
> conflict window, see also the sleep.patch attached, for reference.
>
> Finally attached is a patch, where I'd like to propose the
> introduction of a path in pgstat_drop_entry() to make the routine able
> to accept double drops.
>

I applied the patch on HEAD and ran my psql harness against it (~60
clients looping DROP / CREATE OR REPLACE / CALL, track_functions=all,
pgss loaded). Unpatched it PANICs within seconds; with the patch it
stayed up for a ~3 minute run, with the out-of-band drop path firing
several thousand times. So it clearly closes the hole here.

> The big comment within pgstat_init_function_usage() documents why it
> does its stuff for track_functions, so I was wondering if we should
> enforce the same double-drop-acceptance rule for all the callers
> everybody, but I also see a point in the correctness, by allowing the
> caller to complain if we try to do double drops but error on them,
> pointing to a programming error. Note that
> pgstat_drop_entry_internal() is not touched on purpose, to keep the
> database-level scans as they are, with double-drops forbidden.
>
> This patch is very close to what Sami has posted on his PGSS thread,
> v3-0002, using a missing_ok instead of a skip_dropped:
>
> https://www.postgresql.org/message-id/CAA5RZ0uoxiQ2_=xHGRnyc4WdM9aR0fzdMhBubnw97po==--yGQ@mail.gmail.com
> I didn't suspect that we would need something like that for a
> backpatch, but well.
>
> I'm adding Sami in CC in case he wishes to comment on this patch, and
> Horiguchi-san as this area of the code concerns him.
>
> Thoughts or comments welcome.
>

A couple of things which I'm not clear about (these are not blockers
just questions for my understanding):

- With the check moved into the wrapper, pgstat_drop_entry_internal()
still keeps its own "already dropped" elog(). Every path into
_internal now seems to guarantee the entry isn't dropped, so
_internal's copy looks unreachable after the patch
,and it's the one with the richer refcount/generation detail. Was
the idea to leave it as a backstop, or would folding the handling into
one place (or making _internal's an Assert) be cleaner?

- In the missing_ok path the wrapper returns true, so the post-commit
caller skips the not_freed_count++/GC request that a "real" not-freed
drop would do. That seems harmless since the entry self-heals
but was returning true there a deliberate choice over mirroring
the not-freed/false path? I need to take a look again at this, maybe
I missed something.

Regards,
Ayush

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Amit Langote 2026-06-17 08:27:25 Re: BUG #19458: OOM killer in jsonb_path_exists_opr (@?) with malformed JSONPath containing non-existent variables
Previous Message Michael Paquier 2026-06-17 04:15:13 Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions =