Re: persist logical slots to disk during shutdown checkpoint

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-09-05 03:39:43
Message-ID: CAFiTN-uUJpy8tHtmEAZF1EpimD_vY-psi9vQKhU=w=KuZXPhaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 1, 2023 at 12:16 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Fri, 1 Sept 2023 at 10:06, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 31, 2023 at 6:12 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Aug 31, 2023 at 2:52 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > All but one. Normally, the idea of marking dirty is to indicate that
> > > > we will actually write/flush the contents at a later point (except
> > > > when required for correctness) as even indicated in the comments atop
> > > > ReplicatioinSlotMarkDirty(). However, I see your point that we use
> > > > that protocol at all the current places including CreateSlotOnDisk().
> > > > So, we can probably do it here as well.
> > >
> > > yes
> > >
> >
> > I think we should also ensure that slots are not invalidated
> > (slot.data.invalidated != RS_INVAL_NONE) before marking them dirty
> > because we don't allow decoding from such slots, so we shouldn't
> > include those.
>
> Added this check.
>
> Apart from this I have also fixed the following issues that were
> agreed on: a) Setting slots to dirty in CheckPointReplicationSlots
> instead of setting it in SaveSlotToPath b) The comments were moved
> from ReplicationSlot and moved to CheckPointReplicationSlots c) Tests
> will be run in autovacuum = off d) updating last_saved_confirmed_flush
> based on cp.slotdata.confirmed_flush rather than
> slot->data.confirmed_flush.
> I have also added the commitfest entry for this at [1].

The overall idea looks fine to me

+
+ /*
+ * We won't ensure that the slot is persisted after the
+ * confirmed_flush LSN is updated as that could lead to frequent
+ * writes. However, we need to ensure that we do persist the slots at
+ * the time of shutdown whose confirmed_flush LSN is changed since we
+ * last saved the slot to disk. This will help in avoiding retreat of
+ * the confirmed_flush LSN after restart.
+ */
+ if (is_shutdown && SlotIsLogical(s))
+ {
+ SpinLockAcquire(&s->mutex);
+ if (s->data.invalidated == RS_INVAL_NONE &&
+ s->data.confirmed_flush != s->last_saved_confirmed_flush)
+ s->dirty = true;
+ SpinLockRelease(&s->mutex);
+ }

The comments don't mention anything about why we are just flushing at
the shutdown checkpoint. I mean the checkpoint is not that frequent
and we already perform a lot of I/O during checkpoints so isn't it
wise to flush during every checkpoint. We may argue that there is no
extra advantage of that as we are not optimizing for crash recovery
but OTOH there is no reason for not doing so for other checkpoints or
we are worried about the concurrency with parallel walsender running
during non shutdown checkpoint if so then better we explain that as
well? If it is already discussed in the thread and we have a
conclusion on this then maybe we can mention this in comments?

/*
* Flush all replication slots to disk.
*
- * This needn't actually be part of a checkpoint, but it's a convenient
- * location.
+ * is_shutdown is true in case of a shutdown checkpoint.
*/
void
-CheckPointReplicationSlots(void)
+CheckPointReplicationSlots(bool is_shutdown)

It seems we have removed two lines from the function header comments,
is this intentional or accidental?

Other than this patch LGTM.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2023-09-05 03:49:33 Re: proposal: psql: show current user in prompt
Previous Message Noah Misch 2023-09-05 03:36:05 Re: Optionally using a better backtrace library?