| From: | Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com> |
|---|---|
| To: | zlh21343(at)163(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz> |
| Subject: | Re: BUG #19520: PANIC when concurrently manipulating stored procedures with pg_stat_statements and track_functions = |
| Date: | 2026-06-15 09:14:06 |
| Message-ID: | CAJTYsWVqtWF+KJUdoKrzUM9QPQA5qD225AoeBeN7cwT4V1qd6A@mail.gmail.com |
| Lists: | pgsql-bugs |
Hi,
On Sun, 14 Jun 2026 at 22:37, PG Bug reporting form <noreply(at)postgresql(dot)org>
wrote:
> The following bug has been logged on the website:
>
> Bug reference: 19520
> Logged by: zhanglihui
> Email address: zlh21343(at)163(dot)com
> PostgreSQL version: 19beta1
> Operating system: Ubuntu 25.04
> Description:
>
> === Configuration ===
> Enabled extensions: pg_stat_statements
> postgresql.conf:
> shared_preload_libraries = 'pg_stat_statements'
> track_functions = 'all'
>
> === Problem Description ===
> PostgreSQL server throws PANIC under high concurrent create, CALL and DROP
> of stored procedures.
> This issue **only reproduces when pg_stat_statements is enabled and
> track_functions = all**.
> It cannot be triggered if pg_stat_statements is disabled or track_functions
> is set to none/pl.
>
> === Run Steps ===
> javac -cp postgresql-42.7.5.jar -d out src/ConcurrentSqlTest.java
> src/ProcCrashReprodure.java
> # Terminal 1 — Mixed DDL/DML load test (40 threads, 10000 iterations each)
> java -cp out:postgresql-42.7.5.jar ConcurrentSqlTest
>
> # Terminal 2 — Pure CALL load test (20 threads, infinite loop)
> java -cp out:postgresql-42.7.5.jar ProcCrashReprodure
> # Note: If no PANIC log or core dump is generated after the execution of
> Terminal 1, please re-run the command repeatedly until the issue occurs.
>
> PANIC log:
> postgresql-2026-06-14_235441.log:2026-06-14 23:54:41.949 CST [691761]
> PANIC:
> XX000: cannot abort transaction 4166281, it was already committed
> postgresql-2026-06-14_235931.log:2026-06-14 23:59:31.556 CST [696980]
> PANIC:
> XX000: cannot abort transaction 4641943, it was already committed
>
> (gdb) bt
> #0 __pthread_kill_implementation (threadid=<optimized out>, signo=6,
> no_tid=0) at ./nptl/pthread_kill.c:44
> #1 __pthread_kill_internal (threadid=<optimized out>, signo=6) at
> ./nptl/pthread_kill.c:89
> #2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at
> ./nptl/pthread_kill.c:100
> #3 0x000073e62ec4579e in __GI_raise (sig=sig@entry=6) at
> ../sysdeps/posix/raise.c:26
> #4 0x000073e62ec288cd in __GI_abort () at ./stdlib/abort.c:73
> #5 0x000056aeb574138c in errfinish (filename=0x56aeb57ff545 "xact.c",
> lineno=1835, funcname=0x56aeb5800d00 <__func__.26>
> "RecordTransactionAbort")
> at elog.c:621
> #6 0x000056aeb4faae67 in RecordTransactionAbort (isSubXact=false) at
> xact.c:1835
> #7 0x000056aeb4fac19f in AbortTransaction () at xact.c:2982
> #8 0x000056aeb4facc22 in AbortCurrentTransactionInternal () at xact.c:3553
> #9 0x000056aeb4facb93 in AbortCurrentTransaction () at xact.c:3507
> #10 0x000056aeb5525552 in PostgresMain (dbname=0x56aedecf30d0 "postgres",
> username=0x56aedecf30b0 "zlh_user") at postgres.c:4539
> #11 0x000056aeb551b59c in BackendMain (startup_data=0x7fffb9501000,
> startup_data_len=24) at backend_startup.c:124
> #12 0x000056aeb5405686 in postmaster_child_launch (child_type=B_BACKEND,
> child_slot=24, startup_data=0x7fffb9501000, startup_data_len=24,
> client_sock=0x7fffb9501060) at launch_backend.c:268
> #13 0x000056aeb540c11b in BackendStartup (client_sock=0x7fffb9501060) at
> postmaster.c:3627
> #14 0x000056aeb540969f in ServerLoop () at postmaster.c:1728
> #15 0x000056aeb5408f7c in PostmasterMain (argc=1, argv=0x56aedeca1430) at
> postmaster.c:1415
> #16 0x000056aeb528ce51 in main (argc=1, argv=0x56aedeca1430) at main.c:231
>
Thanks for the report! I'm not sure whether this has already been
reported.
I looked into this over the last day and could reproduce it locally.
Rather than the Java harness, I used ~60 concurrent psql clients looping
DROP / CREATE OR REPLACE / CALL of the same empty plpgsql procedure
(track_functions=all, pg_stat_statements loaded); with that setup it
PANICs within a few seconds.
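For anyone who wants to try something similar, here is a minimal sketch of such a psql-based loop. It is not the script used above: the procedure name `repro_proc`, the `IF EXISTS` on the DROP, the iteration count and the helper names are all assumptions made for illustration.

```python
import os
import subprocess
import tempfile

# Hypothetical procedure name; the message above does not give one.
PROC_NAME = "repro_proc"


def build_loop_sql(iterations: int, proc: str = PROC_NAME) -> str:
    """Return a script that repeats DROP / CREATE OR REPLACE / CALL."""
    one_round = (
        f"DROP PROCEDURE IF EXISTS {proc}();\n"
        f"CREATE OR REPLACE PROCEDURE {proc}() "
        f"LANGUAGE plpgsql AS $$ BEGIN END $$;\n"
        f"CALL {proc}();\n"
    )
    return one_round * iterations


def launch_clients(clients: int, iterations: int, dbname: str = "postgres") -> list:
    """Run the same script from several psql processes at the same time."""
    with tempfile.NamedTemporaryFile("w", suffix=".sql", delete=False) as f:
        f.write(build_loop_sql(iterations))
        path = f.name
    try:
        procs = [
            subprocess.Popen(
                # -X skips psqlrc, -q keeps psql quiet, -f runs the script file
                ["psql", "-X", "-q", "-d", dbname, "-f", path],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            for _ in range(clients)
        ]
        # Wait for every client and collect the exit codes
        return [p.wait() for p in procs]
    finally:
        os.unlink(path)
```

Something like `launch_clients(60, 1000)` against a server with track_functions = all and pg_stat_statements preloaded would approximate the setup. Ordinary errors from the concurrent DDL are expected along the way; psql keeps going past them because ON_ERROR_STOP is off by default.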
Just before the PANIC the failing backend logs:
ERROR: trying to drop stats entry already dropped: kind=function ...
WARNING: AbortTransaction while in COMMIT state
PANIC: cannot abort transaction xxx, it was already committed
So it looks like a function's shared stats entry gets dropped twice:
once out-of-band from pgstat_init_function_usage() when a concurrent
CALL notices the function is gone, and once from the transactional drop
scheduled by the DROP itself. When the latter loses the race it runs from
AtEOXact_PgStat(), after the commit record has been written, so the
"already dropped" ERROR raised by elog() in pgstat_drop_entry_internal()
forces an abort of an already-committed transaction, which is the PANIC.
The two droppers and the guard all seem to date back to PG 15
(5891c7a8ed8f). I guess the "drop exactly once" assumption behind that
guard doesn't really hold for function stats, where two independent
droppers are legitimate.
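The sequence can be illustrated with a toy model. This is not PostgreSQL code: every class and function name below is invented, and the model only mirrors the ordering described above (out-of-band drop first, transactional drop second, after the commit point).

```python
class AlreadyDropped(Exception):
    """Toy stand-in for the 'already dropped' ERROR."""


class Panic(Exception):
    """Toy stand-in for the PANIC."""


class SharedStatsEntry:
    """Toy shared stats entry for one function."""

    def __init__(self):
        self.dropped = False


def drop_entry(entry):
    # "Drop exactly once" guard: a second drop is treated as an error.
    if entry.dropped:
        raise AlreadyDropped("trying to drop stats entry already dropped")
    entry.dropped = True


def out_of_band_drop(entry):
    # A concurrent CALL notices the function is gone and drops its stats.
    drop_entry(entry)


def commit_drop_procedure(entry):
    # The commit record is considered written before the stats drop runs.
    committed = True
    try:
        drop_entry(entry)  # transactional drop, executed at end of transaction
    except AlreadyDropped as err:
        if committed:
            # An error here would require aborting a committed transaction.
            raise Panic("cannot abort transaction, it was already committed") from err
        raise
    return "committed"
```

Without a competing dropper, `commit_drop_procedure(SharedStatsEntry())` completes normally. If `out_of_band_drop()` runs on the same entry first, the guard fires after the commit point and the model raises `Panic`.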
I've added Andres and Michael to the thread for their input, since they
have worked on this area in the past.
Regards,
Ayush