Re: persist logical slots to disk during shutdown checkpoint

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-09-11 09:19:49
Message-ID: CAA4eK1K8O-pZkNdpMasqYi_uSNsbNO4SwQXvD9bn_JTgoKtm_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 11, 2023 at 12:08 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Fri, Sep 08, 2023 at 11:50:37AM +0530, Amit Kapila wrote:
> > On Fri, Sep 8, 2023 at 10:08 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >>
> >>
> >> + /*
> >> + * This is used to track the last persisted confirmed_flush LSN value to
> >> + * detect any divergence in the in-memory and on-disk values for the same.
> >> + */
> >>
> >> "This value tracks is the last LSN where this slot's data has been
> >> flushed to disk.
> >>
> >
> > This makes the comment vague as this sounds like we are saving a slot
> > corresponding to some LSN which is not the case. If you prefer this
> > tone then we can instead say: "This value tracks the last
> > confirmed_flush LSN flushed which is used during a checkpoint shutdown
> > to decide if a logical slot's data should be forcibly flushed or not."
>
> Okay, that looks like an improvement over the term "persisted".
>

Changed accordingly.

> >> This is used during a checkpoint shutdown to decide
> >> if a logical slot's data should be forcibly flushed or not."
> >>
> >> Hmm. WAL senders are shut down *after* the checkpointer and *after*
> >> the shutdown checkpoint is finished (see PM_SHUTDOWN and
> >> PM_SHUTDOWN_2) because we want the WAL senders to acknowledge the
> >> checkpoint record before shutting down the primary.
> >>
> >
> > As per my understanding, this is not true for logical walsenders. As
> > per code, while HandleCheckpointerInterrupts(), we call ShutdownXLOG()
> > which sends a signal to walsender to stop and waits for it to stop.
> > And only after that, did it write a shutdown checkpoint WAL record.
> > After getting the InitStopping signal, walsender sets got_STOPPING
> > flag. Then *logical* walsender ensures that it sends all the pending
> > WAL and exits. What you have quoted is probably true for physical
> > walsenders.
>
> Hm, reminding me about this area.. This roots down to the handling of
> WalSndCaughtUp in the send_data callback for logical or physical.
> This is switched to true for logical WAL senders much earlier than
> physical WAL senders, aka before the shutdown checkpoint begins in the
> latter. What was itching me a bit is that the postmaster logic could
> be made more solid. Logical and physical WAL senders are both marked
> as BACKEND_TYPE_WALSND, but we don't actually check that the WAL
> senders remaining at the end of PM_SHUTDOWN_2 are *not* connected to a
> database. This would require a new BACKEND_TYPE_* perhaps, or perhaps
> we're fine with the current state because we'll catch up problems in
> the checkpointer if any WAL is generated while the shutdown checkpoint
> is running anyway. Just something I got in mind, unrelated to this
> patch.
>

I don't know if the difference is worth inventing a new BACKEND_TYPE_*
but if you think so then we can probably discuss this in a new thread.
I think we may want to improve some comments as a separate patch to
make this evident.

>
> + * We can flush dirty replication slots at regular intervals by any
> + * background process like bgwriter but checkpoint is a convenient location.
>
> I still don't quite see a need to mention the bgwriter at all here..
> That's just unrelated.
>

I don't disagree with it, so changed it in the attached patch.

> The comment block in CheckPointReplicationSlots() from v10 uses
> "persist", but you mean "flush", I guess..
>

This point is not very clear to me. Can you please quote the exact
comment if you think something needs to be changed?

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v11-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch application/octet-stream 10.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rajendra Kumar Dangwal 2023-09-11 09:23:00 Re: Pgoutput not capturing the generated columns
Previous Message Kuwamura Masaki 2023-09-11 08:49:46 Re: pg_rewind with cascade standby doesn't work well