| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
|---|---|
| To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Cc: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Subject: | Re: Newly created replication slot may be invalidated by checkpoint |
| Date: | 2025-11-13 04:18:57 |
| Message-ID: | CAA4eK1+dEWiTS-fRkTsBSvfetPXaom4JEYSqqXAq-cFiL7rBdg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Nov 13, 2025 at 8:03 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, November 12, 2025 11:55 PMVitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
> >
> > On Monday, November 10, 2025 12:11 MSK, Amit Kapila
> > <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > The fix seems to be only provided for bank branches, but IIUC the
> > > problem can happen in HEAD as well. In Head, how about acquiring
> > > ReplicationSlotAllocationLock in Exclusive mode during
> > > ReplicationSlotReserveWal? This lock is acquired in SHARE mode in
> > > CheckPointReplicationSlots. So, this should make our calculations
> > > correct and avoid invalidating the newly created slot.
> >
> > Amit, I'm not sure how it may help to avoid the change of redo rec ptr during
> > wal reservation by a slot, because it doesn't serialize redo rec ptr assignment
> > and slot's wal reservation, but the core issue is in reco rec ptr change when we
> > reserve the wal by a slot.
>
> I think the main issue here lies in the possibility that the minimum restart_lsn
> obtained during a checkpoint could be less than the WAL position that is being
> reserved concurrently. So, instead of serializing the redo assignment and WAL
> reservation, Amit proposes serializing the CheckPointReplicationSlots() and WAL
> reservation. This would ensure the following:
>
> 1) If the WAL reservation occurs first, the checkpoint must wait for the
> restart_lsn to be updated before proceeding with WAL removal. This guarantees
> that the most recent restart_lsn position is detected.
>
> 2) If the checkpoint calls CheckPointReplicationSlots() first, then any
> subsequent WAL reservation must take a position later than the redo pointer.
>
Your understanding is correct. This whole difference of the way same
problem can happen in HEAD and back branches but in different ways
because the code varies a lot between branches. Now, if we again put a
different fix for the newly discovered problem, then it will take the
code much further away adding more to the maintenance burden. So, I
thought we should once discuss making the code same in HEAD and back
branches. Can we do some research to see if any of the available
extensions uses sizeof(ReplicationSlot)? I think even if we don't find
any such extension there will always be a chance that there is some
closed source extension that might rely on it but I think still this
is worth considering to change ABI in back branches for this fix. This
code area is quite subtle where the bugs exist from a long time and
the initial fix also didn't close all holes.
--
With Regards,
Amit Kapila.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Quan Zongliang | 2025-11-13 04:48:21 | Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement |
| Previous Message | Quan Zongliang | 2025-11-13 04:17:35 | Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement |