Segmentation fault on proc exit after dshash_find_or_insert

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Segmentation fault on proc exit after dshash_find_or_insert
Date: 2025-11-21 11:45:35
Message-ID: CAH2L28uSvyiosL+kaic9249jRVoQiQF6JOnaCitKFq=xiFzX3g@mail.gmail.com
Lists: pgsql-hackers

Hi,

If a process hits a FATAL error after acquiring a dshash lock but before
releasing it, and it is not inside a transaction, the process can crash
with a segmentation fault while exiting.

The FATAL error causes the backend to exit, which runs proc_exit() and
the shutdown routines it calls. Outside a transaction, LWLockReleaseAll()
is not called until ProcKill(). ProcKill() is an on_shmem_exit callback,
and dsm_backend_shutdown() is called before any on_shmem_exit callbacks
are invoked. Consequently, if a dshash lock was held when the FATAL error
occurred, the lock is released only after dsm_backend_shutdown() has
detached the DSM segment that contains the lock. The release then touches
unmapped memory, resulting in a segmentation fault.
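
The ordering described above can be modeled with a small standalone C
sketch. It is not PostgreSQL code: every toy_* name and the ToySegment
type are hypothetical stand-ins, and the "segfault" is only recorded as a
flag so that the sequence of events can be checked safely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are PostgreSQL APIs. */
typedef struct ToyLock
{
    bool held;
} ToyLock;

typedef struct ToySegment
{
    bool    mapped;     /* false once the "DSM segment" is detached */
    ToyLock lock;       /* the "dshash lock" lives inside the segment */
} ToySegment;

/* Segment whose lock this process still holds, if any. */
static ToySegment *held_lock_segment = NULL;
static bool touched_unmapped_memory = false;

/* Models acquiring the dshash lock just before the FATAL error. */
void toy_lock_acquire(ToySegment *seg)
{
    seg->lock.held = true;
    held_lock_segment = seg;
}

/* Models dsm_backend_shutdown(): the segment is no longer mapped. */
void toy_detach_segments(ToySegment *seg)
{
    seg->mapped = false;
}

/* Models LWLockReleaseAll() as reached from ProcKill(). */
void toy_release_all_locks(void)
{
    if (held_lock_segment == NULL)
        return;
    if (!held_lock_segment->mapped)
        touched_unmapped_memory = true; /* a real backend would crash here */
    else
        held_lock_segment->lock.held = false;
    held_lock_segment = NULL;
}

/*
 * Models the exit path outside a transaction: segments are detached
 * first, and the on_shmem_exit callbacks (including ProcKill) run after.
 * Returns true if a lock release touched a detached segment.
 */
bool toy_shmem_exit(ToySegment *seg)
{
    touched_unmapped_memory = false;
    toy_detach_segments(seg);   /* dsm_backend_shutdown() */
    toy_release_all_locks();    /* ProcKill() -> LWLockReleaseAll() */
    return touched_unmapped_memory;
}
```

Calling toy_lock_acquire() on a mapped segment and then toy_shmem_exit()
reports the invalid access; calling toy_shmem_exit() with no lock held
does not.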

Please find a reproducer attached. I have modified the test_dsm_registry
module to create a background worker that does nothing but throw a FATAL
error after acquiring the dshash lock. It has to run in a background
worker to ensure that it runs without a transaction.

To trigger the segmentation fault, apply the 0001-Reproducer* patch, run
make install in the test_dsm_registry module, set
shared_preload_libraries to test_dsm_registry in postgresql.conf, and
start the server.

Please find attached a fix that calls LWLockReleaseAll() early in
shmem_exit(). This ensures that the dshash lock is released before
dsm_backend_shutdown() is called. It also ensures that callbacks invoked
later in shmem_exit() will not fail to acquire a lock because the exiting
process still holds it.
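
As a rough illustration of the proposed reordering, the standalone C
sketch below records the order of shutdown steps with and without an
early lock release. It is only a model under stated assumptions: the
sim_* names and the trace strings are hypothetical and do not appear in
the attached patch or in PostgreSQL.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SIM_TRACE_LEN 128

static char sim_trace[SIM_TRACE_LEN];   /* order of shutdown steps */
static bool sim_lock_held = false;
static bool sim_segment_mapped = true;

/* Append one step name to the comma-separated trace. */
static void sim_note(const char *step)
{
    size_t used = strlen(sim_trace);

    snprintf(sim_trace + used, SIM_TRACE_LEN - used, "%s%s",
             used ? "," : "", step);
}

/* Models LWLockReleaseAll(): notes whether the segment was still mapped. */
void sim_release_all_locks(void)
{
    if (sim_lock_held)
    {
        sim_note(sim_segment_mapped ? "release-ok" : "release-after-detach");
        sim_lock_held = false;
    }
}

/* Models dsm_backend_shutdown(). */
void sim_detach_segments(void)
{
    sim_segment_mapped = false;
    sim_note("detach");
}

/* Models ProcKill(), which also releases any locks still held. */
void sim_prockill(void)
{
    sim_release_all_locks();
    sim_note("prockill");
}

/*
 * Models shmem_exit() after a FATAL error raised while holding the lock.
 * With release_early set, locks are dropped before anything else runs.
 */
const char *sim_shmem_exit(bool release_early)
{
    sim_trace[0] = '\0';
    sim_lock_held = true;
    sim_segment_mapped = true;

    if (release_early)
        sim_release_all_locks();    /* proposed: early in shmem_exit() */
    sim_detach_segments();
    sim_prockill();
    return sim_trace;
}
```

In this model, sim_shmem_exit(true) releases the lock while the segment
is still mapped, whereas sim_shmem_exit(false) reaches the release only
after the detach step.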

Please see the backtrace below.

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055a7515af56c in pg_atomic_fetch_sub_u32_impl (ptr=0x7f92c4b334f4, sub_=262144) at ../../../../src/include/port/atomics/generic-gcc.h:218
218 return __sync_fetch_and_sub(&ptr->value, sub_);
(gdb) bt
#0 0x000055a7515af56c in pg_atomic_fetch_sub_u32_impl (ptr=0x7f92c4b334f4, sub_=262144) at ../../../../src/include/port/atomics/generic-gcc.h:218
#1 0x000055a7515af625 in pg_atomic_sub_fetch_u32_impl (ptr=0x7f92c4b334f4, sub_=262144) at ../../../../src/include/port/atomics/generic.h:232
#2 0x000055a7515af709 in pg_atomic_sub_fetch_u32 (ptr=0x7f92c4b334f4, sub_=262144) at ../../../../src/include/port/atomics.h:441
#3 0x000055a7515b1583 in LWLockReleaseInternal (lock=0x7f92c4b334f0, mode=LW_EXCLUSIVE) at lwlock.c:1840
#4 0x000055a7515b1638 in LWLockRelease (lock=0x7f92c4b334f0) at lwlock.c:1902
#5 0x000055a7515b16e9 in LWLockReleaseAll () at lwlock.c:1951
#6 0x000055a7515ba63d in ProcKill (code=1, arg=0) at proc.c:953
#7 0x000055a7515913af in shmem_exit (code=1) at ipc.c:276
#8 0x000055a75159119b in proc_exit_prepare (code=1) at ipc.c:198
#9 0x000055a7515910df in proc_exit (code=1) at ipc.c:111
#10 0x000055a7517be71d in errfinish (filename=0x7f92ce41d062 "test_dsm_registry.c", lineno=187, funcname=0x7f92ce41d160 <__func__.0> "TestDSMRegistryMain") at elog.c:596
#11 0x00007f92ce41ca62 in TestDSMRegistryMain (main_arg=0) at test_dsm_registry.c:187
#12 0x000055a7514db00c in BackgroundWorkerMain (startup_data=0x55a752dd8028, startup_data_len=1472) at bgworker.c:846
#13 0x000055a7514de1e8 in postmaster_child_launch (child_type=B_BG_WORKER, child_slot=239, startup_data=0x55a752dd8028, startup_data_len=1472, client_sock=0x0) at launch_backend.c:268
#14 0x000055a7514e530d in StartBackgroundWorker (rw=0x55a752dd8028) at postmaster.c:4168
#15 0x000055a7514e55a4 in maybe_start_bgworkers () at postmaster.c:4334
#16 0x000055a7514e4200 in LaunchMissingBackgroundProcesses () at postmaster.c:3408
#17 0x000055a7514e205b in ServerLoop () at postmaster.c:1728
#18 0x000055a7514e18b0 in PostmasterMain (argc=3, argv=0x55a752dd0e70) at postmaster.c:1403
#19 0x000055a75138eead in main (argc=3, argv=0x55a752dd0e70) at main.c:231
```

Thank you,
Rahila Syed

Attachment                                        Content-Type              Size
0001-Reproducer-segmentation-fault-dshash.patch   application/octet-stream  2.0 KB
0001-Fix-the-seg-fault-during-proc-exit.patch     application/octet-stream  1.2 KB