| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "suyu(dot)cmj" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, tomas <tomas(at)vondra(dot)me>, michael <michael(at)paquier(dot)xyz>, "bharath(dot)rupireddyforpostgres" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
| Subject: | Re: Newly created replication slot may be invalidated by checkpoint |
| Date: | 2025-12-04 05:58:03 |
| Message-ID: | CAA4eK1KV_Da=kBSO9TYKmUYWm=RYTOZef00fTtBT3uV+dEr6aA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Dec 4, 2025 at 2:04 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Dec 2, 2025 at 10:15 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > I think the invalidation cannot occur when copying because:
> >
> > Currently, there are no CHECK_FOR_INTERRUPTS() calls between the initial
> > restart_lsn copy (first phase) and the latest restart_lsn copy (second phase).
> > As a result, even if a checkpoint attempts to invalidate a slot and sends a
> > SIGTERM to the backend, the backend will first update the restart_lsn during the
> > second phase before responding to the signal. Consequently, during the next
> > cycle of InvalidatePossiblyObsoleteSlot(), the checkpoint will observe the
> > updated restart_lsn and skip the invalidation.
> >
> > For logical slots, although invoking the output plugin startup callback presents
> > a slight chance of processing the signal (when using third-party plugins), slot
> > invalidation in this scenario results in immediate slot dropping, because the
> > slot is in RS_EPHEMERAL state, thus preventing invalidation.
>
> Thank you for the analysis. I agree.
>
> > While theoretically, slot invalidation could occur if the code changes in the
> > future, addressing that possibility could be considered an independent
> > improvement task. What do you think ?
>
> Okay. I find that it also might make sense for HEAD to use
> RS_EPHEMERAL state for physical slots too to avoid being invalidated
> during creation, which probably can be discussed later. For back
> branches, the proposed idea of acquiring ReplicationSlotAllocationLock
> in an exclusive mode would be better. I think we might want to have a
> comment in CheckPointReplicationSlots() too that refers to
> ReplicationSlotReserveWal().
>
> Regarding whether we revert the original fix 2090edc6f32 and make it
> the same as we did in HEAD ca307d5cec90a4f, we need to change the size
> of ReplicationSlot struct. I'm concerned that it's really safe to
> change it because the data resides on the shared memory. For example,
> we typically iterate over all replication slots as follow:
>
> for (i = 0; i < max_replication_slots; i++)
> {
> ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
>
> I'm concerned that the arithmetic for calculating the slot address is
> affected by the size of ReplicationSlot change.
>
Yes, this is a valid concern. I think we can go-ahead with fixing the
0001's-fix in HEAD and 18. We can discuss separately the fix for
back-branches prior to 18.
--
With Regards,
Amit Kapila.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-12-04 06:03:50 | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
| Previous Message | Amit Langote | 2025-12-04 05:54:09 | Re: index prefetching |