RE: Fix race in ReplicationSlotRelease for ephemeral slots

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Xuneng Zhou <xunengzhou(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Fix race in ReplicationSlotRelease for ephemeral slots
Date: 2026-06-06 09:35:01
Message-ID: TY4PR01MB1771887D33612C5A45F7E9CDF941E2@TY4PR01MB17718.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, June 5, 2026 8:45 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> On Wed, Jun 3, 2026 at 8:03 PM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 2, 2026 at 3:00 PM Xuneng Zhou <xunengzhou(at)gmail(dot)com>
> wrote:
> >
> > /* Drop the local slot if it is not required to be retained. */
> > if (!local_sync_slot_required(local_slot, remote_slot_list))
> > {
> > + bool dropped = false;
> > + NameData slot_name = {0};
> > + Oid slot_database = local_slot->data.database;
> > bool synced_slot;
> >
> > Is it really safe to read slot_database before acquiring the database lock?
>
> Reading slot_database before taking the database lock seems not
> inherently unsafe by itself. The comment suggests that the lock is
> primarily used to prevent conflicts with the startup process running
> ReplicationSlotsDropDBSlots() during db-drop replay; it does not
> protect replication slot array reuse.
>
> The unsafe part could be reading slot_database from local_slot after
> ReplicationSlotControlLock has been released. At this point, the slot
> array cell may already have been freed and reused, so the value read
> may no longer belong to the slot that get_local_synced_slots()
> originally collected. As a result, we could end up locking the wrong
> database.
>
> There seems to be two related issues:
>
> 1) Before drop: reading local_slot->data.database /
> local_slot->data.name after the slot-array lock was released, before
> verifying the cell still represents the same synced slot.

I recall condition (1) is considered acceptable, since the database lock is
released immediately after re-verifying that the slot is no longer the original
'synced' one anyway. Additionally, this race can only occur when replaying a
DROP DATABASE, which is rare in practice. Since we only take a shared lock, it
does not seem to cause real issues.

>
> 2) After drop: calling ReplicationSlotDropAcquired(false) and then
> reading local_slot->data.database / local_slot->data.name for
> unlocking/logging after the cell may have been reused by another
> backend.

Right. I missed this race condition during implementation and agree it's a real
issue. An even bigger problem is that we could release a lock on the wrong
database if the slot entry is reused after being dropped. I think we should fix
this by saving the database OID at the beginning, as your patch does.

>
> The prior patch targets to fix 2), but leaves 1) unfixed.
>
> > BTW, I'm also wondering whether it's safe for
> > local_sync_slot_required() to access local_slot without holding a lock.
>
> I share the same concern here. The issue smells similar to the one
> discussed above -- reading values from the array cell directly without
> the protection of array lock.
>
> To solve them altogether, it might be
> better to stop carrying raw ReplicationSlot * values across
> unprotected windows. We can get the snapshot values such as slot name
> & database oid from get_local_synced_slots() instead. Then
> local_sync_slot_required() works from the snapshot, and
> drop_local_obsolete_slots() uses the copied database OID to take the
> database lock. Before dropping, it should revalidate by copied
> name/database that the slot is still a synced logical slot, then
> acquire/drop by copied name, and use only copied values for
> unlock/logging. I plan to prepare a refactoring patch for this. Does
> that seem like the right direction to you?

Saving only the name and database OID would force us to search for the slot
again in the loop, which was one reason we didn't implement it that way.

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2026-06-06 11:07:11 Re: t/035_standby_logical_decoding.pl might fail on attempt to read wrong timeline
Previous Message Amit Langote 2026-06-06 09:13:15 Re: PG19 FK fast path: OOB write and missed FK checks during batched