Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, tomas(at)vondra(dot)me
Subject: Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date: 2025-05-27 04:08:45
Message-ID: CAA4eK1Lmd-stXWrNqX9iXugjY=ahhe7zRfyq+-vN512xvF4G2w@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
>
> On Mon, May 26, 2025 at 2:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> wrote:
>
> > OTOH, if we don't want to adjust physical
> > slot machinery, it seems saving the logical slots to disk immediately
> > when its restart_lsn is updated is a waste of effort after your patch,
> > no? If so, why are we okay with that?
>
> I don't think so. I think the reason why logical slots are synced to
> disk immediately after update is that logical changes are not
> idempotent (you can't safely apply the same change twice) unlike
> physical block-level changes. This is why logical slots need to be
> synced to prevent double replication of same changes, which could
> lead, for example, to double insertion.
>

Hmm, if that were true, then even in the else branch of
LogicalConfirmReceivedLocation [1], we should have saved the slot.
AFAIU, whether the logical changes are sent to the client is decided
based on two things: (a) replication origins, which track replication
progress and are maintained by clients (which for built-in
replication are the subscriber nodes), see [2]; and (b) the
confirmed_flush LSN maintained in the slot by the server. Now, for
each ack by the client after applying/processing changes, we update
the confirmed_flush LSN of the slot but don't immediately flush it.
This shouldn't cause us to send the changes again because even if the
system crashes and restarts, the client will tell the server the
location to start sending changes from based on its origin tracking.
There is more to it; for example, there are cases where the
confirmed_flush LSN in the slot can be ahead of the origin's LSN, and
we handle all such cases, but I don't think those are directly
related here, so I am skipping those details for now.

Note that LogicalConfirmReceivedLocation won't save the slot to disk
if it updates only confirmed_flush LSN, which is used to decide
whether to send the changes.

> LogicalConfirmReceivedLocation() implements immediate sync for
> different reasons.
>

I may be missing something, but let's discuss some more before we conclude this.

[1]:
	else
	{
		SpinLockAcquire(&MyReplicationSlot->mutex);

		/*
		 * Prevent moving the confirmed_flush backwards. See comments above
		 * for the details.
		 */
		if (lsn > MyReplicationSlot->data.confirmed_flush)
			MyReplicationSlot->data.confirmed_flush = lsn;

		SpinLockRelease(&MyReplicationSlot->mutex);
	}

[2]: https://www.postgresql.org/docs/devel/replication-origins.html

--
With Regards,
Amit Kapila.
