From cbac8a3b949a893f530150a1da212bc67a46af00 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 28 Apr 2026 12:21:21 -0700
Subject: [PATCH v2_18] Fix race between ProcSignalInit() and
 EmitProcSignalBarrier().

Previously, ProcSignalInit() read the global barrier generation before
publishing its PID into the pss_pid slot. This created a race
condition: a process could initialize its local generation with an
older global value, while a concurrent EmitProcSignalBarrier() might
skip that process because its pss_pid was still zero. The skipped
process was then never signaled to absorb the barrier, so
WaitForProcSignalBarrier() could hang indefinitely.

This commit fixes the issue by publishing pss_pid before reading
psh_barrierGeneration, with a memory barrier in between so that the
store is globally visible first. A concurrent EmitProcSignalBarrier()
then either observes the published PID and signals this slot, or
completes its generation increment before we load it.

While this race has become more visible due to recent features using
signal barriers in more places (such as online wal_level changes), the
issue has theoretically been present since signal barriers were
introduced to release smgr caches (e.g., in DROP DATABASE). So
backpatch to 15.

This issue was also reported by buildfarm animal flaviventris.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2WgAJmWReDN7Chtba8Er2YBvKCoa0KVN25-1evnTrHsLyA@mail.gmail.com
Backpatch-through: 15
---
 src/backend/storage/ipc/procsignal.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
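
Note for reviewers (not part of the commit message): the ordering
argument above can be checked with a small standalone model. This is
only a sketch, not PostgreSQL code. ModelState, barrier_missed(),
count_missed_interleavings() and the PID value 1234 are hypothetical
names made up for illustration. The model assumes the four steps
execute in a sequentially consistent order, which is what the added
pg_memory_barrier() is meant to guarantee for the store to pss_pid and
the following load of the generation. It enumerates every interleaving
of the two joiner steps (publish PID, read generation) with the two
emitter steps (increment generation, check PID) and counts the
interleavings in which the new process keeps a stale generation without
being signaled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one ProcSignal slot plus the global generation counter. */
typedef struct
{
    uint32_t pss_pid;           /* 0 means "slot not published yet" */
    uint64_t global_generation; /* stands in for psh_barrierGeneration */
    uint64_t local_generation;  /* generation the new process starts from */
    bool     signaled;          /* did the emitter signal this slot? */
} ModelState;

/*
 * Run one interleaving of the four steps. Bit i of mask set means that
 * step i is the joiner's next step; otherwise it is the emitter's.
 * Returns true if the joiner ends up behind the global generation
 * without having been signaled, i.e. the barrier would never be absorbed.
 */
static bool
barrier_missed(unsigned mask, bool publish_first)
{
    ModelState s = {0, 1, 0, false};
    int joiner_step = 0;
    int emitter_step = 0;

    for (int i = 0; i < 4; i++)
    {
        if (mask & (1u << i))
        {
            /* Joiner: publish PID and read generation, in the chosen order. */
            bool publish = ((joiner_step == 0) == publish_first);

            if (publish)
                s.pss_pid = 1234;   /* hypothetical PID */
            else
                s.local_generation = s.global_generation;
            joiner_step++;
        }
        else
        {
            /* Emitter: bump the generation, then decide whether to signal. */
            if (emitter_step == 0)
                s.global_generation++;
            else
                s.signaled = (s.pss_pid != 0);
            emitter_step++;
        }
    }

    return s.local_generation < s.global_generation && !s.signaled;
}

/* Count the interleavings (out of 6) in which the barrier is missed. */
int
count_missed_interleavings(bool publish_first)
{
    int missed = 0;

    for (unsigned mask = 0; mask < 16; mask++)
    {
        int bits = 0;

        for (int i = 0; i < 4; i++)
            bits += (mask >> i) & 1;
        if (bits != 2)
            continue;           /* each side must run exactly two steps */
        if (barrier_missed(mask, publish_first))
            missed++;
    }
    return missed;
}
```

In this model, count_missed_interleavings(false) corresponds to the old
order (read the generation, then publish the PID) and
count_missed_interleavings(true) to the order used by this patch. The
model says nothing about hardware reordering by itself; without the
memory barrier, the store and the load could still be reordered and the
sequential-consistency assumption would not hold.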

diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 05d99b452c3..a0117ef969b 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -185,6 +185,16 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
 	/* Clear out any leftover signal reasons */
 	MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t));
 
+	/*
+	 * Publish the PID before reading the global barrier generation to ensure
+	 * that EmitProcSignalBarrier() doesn't skip us while we are grabbing an
+	 * older generation. We need a memory barrier here to make sure that the
+	 * update of pss_pid is globally visible before the load of the global
+	 * barrier generation executes.
+	 */
+	pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
+	pg_memory_barrier();
+
 	/*
 	 * Initialize barrier state. Since we're a brand-new process, there
 	 * shouldn't be any leftover backend-private state that needs to be
@@ -204,7 +214,6 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
 	if (cancel_key_len > 0)
 		memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len);
 	slot->pss_cancel_key_len = cancel_key_len;
-	pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
 
 	SpinLockRelease(&slot->pss_mutex);
 
-- 
2.54.0

