| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Subject: | Re: Newly created replication slot may be invalidated by checkpoint |
| Date: | 2025-12-02 16:26:18 |
| Message-ID: | CAD21AoCDBfCCP7w_S+YnfW7LKNmdEAmY3gC-XP_vGYbiRRQRRQ@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Dec 1, 2025 at 10:19 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, December 2, 2025 1:03 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 21, 2025 at 12:14 AM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > OK, I think it makes sense to start separate threads.
> > >
> > > I have split the patches based on the different bugs they
> > > address and am sharing them here for reference.
> > >
> >
> > I'm reviewing the 0001 patch and the problem that can be addressed by
> > that patch. While the proposed patch addresses the race condition
> > between a checkpointing and newly created slot, could the same issue
> > happen between the checkpointing and copying a slot? I'm trying to
> > understand when we have to acquire ReplicationSlotAllocationLock in an
> > exclusive mode in the new lock scheme.
>
> Thanks for reviewing !
>
> I think the situation is somewhat different in the copy_replication_slot(). As
> noted in the comments[1], it's considered acceptable for WALs preceding the
> initial restart_lsn to be removed since the latest restart_lsn will be copied
> again in the second phase, so latest WAL being reserved is safe.

Right. But does that mean the new slot could be invalidated while it
is being copied, if the restart_lsn copied in the first phase falls
behind a new redo pointer set by a concurrent checkpoint? I thought the
problem the 0001 patch is trying to fix is that a slot could end up
being invalidated by a concurrent checkpoint while it is still being
created, so I wonder if the same problem could occur here.

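The interleaving in question can be sketched with a toy model. This is
not PostgreSQL code: the Slot class, the function names, the integer
LSNs, and the invalidation rule (a slot is invalidated when its
restart_lsn is older than the oldest WAL the checkpoint keeps) are all
simplifying assumptions made for illustration. Whether the real
copy_replication_slot() allows this ordering is exactly the open
question.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    # Hypothetical stand-in for a replication slot; LSNs are plain ints.
    name: str
    restart_lsn: int
    invalidated: bool = False


def checkpoint_invalidate(slots, oldest_kept_lsn):
    # Assumed toy rule: a slot that still needs WAL older than what the
    # checkpoint keeps is marked invalidated.
    for slot in slots:
        if slot.restart_lsn < oldest_kept_lsn:
            slot.invalidated = True


def copy_slot_interleaved(src, slots, checkpoint_cutoff, src_advance_to):
    # Phase 1: the new slot takes the source's current restart_lsn.
    copy = Slot(name=src.name + "_copy", restart_lsn=src.restart_lsn)
    slots.append(copy)

    # Meanwhile the source slot advances ...
    src.restart_lsn = src_advance_to

    # ... and a concurrent checkpoint runs before phase 2.
    checkpoint_invalidate(slots, checkpoint_cutoff)

    # Phase 2: re-copy the latest restart_lsn, unless the slot is
    # already invalidated by then.
    if not copy.invalidated:
        copy.restart_lsn = src.restart_lsn
    return copy
```

With a source at restart_lsn 100 that advances to 200, a checkpoint
cutoff of 150 between the two phases invalidates the copy in this model
while the source survives; a cutoff of 50 leaves the copy valid with
restart_lsn 200.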
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com