Re: Newly created replication slot may be invalidated by checkpoint

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Newly created replication slot may be invalidated by checkpoint
Date: 2025-11-10 09:11:40
Message-ID: CAA4eK1KaAmVpiOgFADDfL3EOh8yztOqbUbxm4FzMKnd-xpijpQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 5, 2025 at 3:48 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> On Mon, Oct 6, 2025 at 6:46 PM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
> > There is one subtle thing. Once, the operation of restart_lsn assignment is not
> > an atomic, the following scenario may happen theoretically:
> > 1. Read GetRedoRecPtr() in the backend (ReplicationSlotReserveWal)
> > 2. Assign a new redo LSN in the checkpointer
> > 3. Call ReplicationSlotsComputeRequiredLSN() in the checkpointer
> > 3. Assign the old redo LSN to restart_lsn
> >
> > In this scenario, the restart_lsn will point to a previous redo LSN and it will
> > be not protected by the new redo LSN. This scenario is unlikely, but it can
> > happen theoretically. I have no ideas how to deal with it, except of assigning
> > restart_lsn under XLogCtl->info_lck lock to avoid concurrent modification of
> > XLogCtl->RecoRecPtr until it is assigned to restart_lsn of a creating slot.
> >
> > In case of recovery, when GetXLogReplayRecPtr() is used, the protection by
> > redo LSN seems to work as well, because a new redo LSN is taken from the latest
> > replayed checkpoint. Thus, it is guaranteed that GetXLogReplayRecPtr() will not
> > be less than the new redo LSN, if it is called right after assignment of redo
> > LSN in CreateRestartPoint().
>
> Thank you for highlighting this scenario. I've reviewed it. I think
> we could avoid it by covering appropriate parts of
> ReplicationSlotReserveWal() and Create{Check|Restart}Point() by a new
> LWLock. The draft patch is attached. What do you think?
>

The fix seems to be only provided for bank branches, but IIUC the
problem can happen in HEAD as well. In Head, how about acquiring
ReplicationSlotAllocationLock in Exclusive mode during
ReplicationSlotReserveWal? This lock is acquired in SHARE mode in
CheckPointReplicationSlots. So, this should make our calculations
correct and avoid invalidating the newly created slot.

I feel with the proposed patches for back branches, the code is
deviating too much and also makes it a bit complicated, which means it
could be difficult to maintain it in the future. Can we consider
reverting the original fix 2090edc6f32f652a2c995ca5f7e65748ae1e4c5d
and make it the same as we did in HEAD
ca307d5cec90a4fde62a50fafc8ab607ff1d8664? I know this would lead to
ABI breakage, but only for extensions using sizeof(ReplicationSlot),
if any. We can try to identify how many extensions rely on
sizeof(ReplicationSlot) and then decide accordingly? We recently did
something similar for another backbranch fix [1] which requires adding
members at the end of structure.

OTOH, if we really want to go in the direction of deviating the back
branch code further, then we can review your fix, but I am hesitant to
go in that direction due to additional complexity and maintenance
burden.

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=45c357e0e85d2dffe7af5440806150124a725a01

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2025-11-10 09:21:29 Re: ON CONFLICT DO SELECT (take 3)
Previous Message Chao Li 2025-11-10 09:06:07 Re: [PATCH] Add hints for invalid binary encoding names in encode/decode functions