Re: Newly created replication slot may be invalidated by checkpoint

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>
Cc: "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, aekorotkov <aekorotkov(at)gmail(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Newly created replication slot may be invalidated by checkpoint
Date: 2025-09-23 06:49:54
Message-ID: CAA4eK1+wrNSee6PKQ0+DtUu_W0GdvewskpAEK5EiX6r3E+2Sxw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 17, 2025 at 4:19 PM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> [1] 0001-Fix-invalidation-when-slot-is-created-during-checkpo.patch
>

- /* Calculate how many segments are kept by slots. */
- keep = slotsMinReqLSN;
+ /*
+ * Calculate how many segments are kept by slots. Keep the wal using
+ * the minimal value from the current reserved LSN and the reserved LSN at
+ * the moment of checkpoint start (before CheckPointReplicationSlots).
+ */
+ keep = XLogGetReplicationSlotMinimumLSN();
+ if (!XLogRecPtrIsInvalid(slotsMinReqLSN))
+ keep = Min(keep, slotsMinReqLSN);

Can we add why we need this additional calculation here?

I have one question regarding commit 2090edc6f32f652a2c:
*
if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
+ /*
+ * Recalculate the current minimum LSN to be used in the WAL segment
+ * cleanup. Then, we must synchronize the replication slots again in
+ * order to make this LSN safe to use.
+ */
+ slotsMinReqLSN = XLogGetReplicationSlotMinimumLSN();
+ CheckPointReplicationSlots(shutdown);
+
/*
* Some slots have been invalidated; recalculate the old-segment
* horizon, starting again from RedoRecPtr.
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
- KeepLogSeg(recptr, &_logSegNo);
+ KeepLogSeg(recptr, slotsMinReqLSN, &_logSegNo);

After invalidating the slots, we recalculate the slotsMinReqLSN with
the latest value of XLogGetReplicationSlotMinimumLSN(). Can't it
generate a more recent value of slot's restart_lsn which has not been
flushed and we may end up removing the corresponding WAL? We should
probably add some comments as to why such a race doesn't exist.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2025-09-23 07:06:10 Re: Newly created replication slot may be invalidated by checkpoint
Previous Message Bertrand Drouvot 2025-09-23 06:44:31 Re: Report bytes and transactions actually sent downtream