Re: persist logical slots to disk during shutdown checkpoint

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-09-05 08:15:28
Message-ID: CAFiTN-sOALGjSYzXSw1K6BnApHtddmF_jQE6i8phA6u=+_oggw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 5, 2023 at 10:12 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Sep 5, 2023 at 7:54 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Monday, September 4, 2023 6:15 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > On Mon, 4 Sept 2023 at 15:20, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, Sep 1, 2023 at 10:50 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Fri, 1 Sept 2023 at 10:06, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Thu, Aug 31, 2023 at 6:12 PM Ashutosh Bapat
> > > > > > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > On Thu, Aug 31, 2023 at 2:52 PM Amit Kapila
> > > <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > > >
> > > > > > > > All but one. Normally, the idea of marking dirty is to
> > > > > > > > indicate that we will actually write/flush the contents at a
> > > > > > > > later point (except when required for correctness) as even
> > > > > > > > indicated in the comments atop ReplicatioinSlotMarkDirty().
> > > > > > > > However, I see your point that we use that protocol at all the current
> > > places including CreateSlotOnDisk().
> > > > > > > > So, we can probably do it here as well.
> > > > > > >
> > > > > > > yes
> > > > > > >
> > > > > >
> > > > > > I think we should also ensure that slots are not invalidated
> > > > > > (slot.data.invalidated != RS_INVAL_NONE) before marking them dirty
> > > > > > because we don't allow decoding from such slots, so we shouldn't
> > > > > > include those.
> > > > >
> > > > > Added this check.
> > > > >
> > > > > Apart from this I have also fixed the following issues that were
> > > > > agreed on: a) Setting slots to dirty in CheckPointReplicationSlots
> > > > > instead of setting it in SaveSlotToPath
> > > > >
> > > >
> > > > + if (is_shutdown && SlotIsLogical(s)) { SpinLockAcquire(&s->mutex);
> > > > + if (s->data.invalidated == RS_INVAL_NONE &&
> > > > + s->data.confirmed_flush != s->last_saved_confirmed_flush) dirty =
> > > > + s->true;
> > > >
> > > > I think it is better to use ReplicationSlotMarkDirty() as that would
> > > > be consistent with all other usages.
> > >
> > > ReplicationSlotMarkDirty works only on MyReplicationSlot whereas
> > > CheckpointReplicationSlots loops through all the slots and marks the
> > > appropriate slot as dirty, we might have to change ReplicationSlotMarkDirty to
> > > take the slot as input parameter and all caller should pass MyReplicationSlot.
> >
> > Personally, I feel if we want to centralize the code of marking dirty into a
> > function, we can introduce a new static function MarkSlotDirty(slot) to mark
> > passed slot dirty and let ReplicationSlotMarkDirty and
> > CheckpointReplicationSlots call it. Like:
> >
> > void
> > ReplicationSlotMarkDirty(void)
> > {
> > MarkSlotDirty(MyReplicationSlot);
> > }
> >
> > +static void
> > +MarkSlotDirty(ReplicationSlot *slot)
> > +{
> > + Assert(slot != NULL);
> > +
> > + SpinLockAcquire(&slot->mutex);
> > + slot->just_dirtied = true;
> > + slot->dirty = true;
> > + SpinLockRelease(&slot->mutex);
> > +}
> >
> > This is somewhat similar to the relation between ReplicationSlotSave(serialize
> > my backend's replications slot) and SaveSlotToPath(save the passed slot).
> >
> > > Another thing is we have already taken spin lock to access
> > > last_confirmed_flush_lsn from CheckpointReplicationSlots, we could set dirty
> > > flag here itself, else we will have to release the lock and call
> > > ReplicationSlotMarkDirty which will take lock again.
> >
> > Yes, this is unavoidable, but maybe it's not a big problem as
> > we only do it at shutdown.
> >
>
> True but still it doesn't look elegant. I also thought about having a
> probably inline function that marks both just_dirty and dirty fields.
> However, that requires us to assert that the caller has already
> acquired a spinlock. I see a macro SpinLockFree() that might help but
> it didn't seem to be used anywhere in the code so not sure if we can
> rely on it.

Can't we just have code like this? I mean we will have to make
ReplicationSlotMarkDirty take slot as an argument or have another
version which takes slot as an argument and that would be called by us
as well as by ReplicationSlotMarkDirty(). I mean why do we need these
checks (s-(data.invalidated == RS_INVAL_NONE &&
s->data.confirmed_flush != s->last_saved_confirmed_flush) under the
mutex? Walsender is shutdown so confirmed flush LSN can not move
concurrently and slot can not be invalidated as well because that is
done by checkpointer and we are in checkpointer?

+ if (is_shutdown && SlotIsLogical(s))
+ {
+ if (s->data.invalidated == RS_INVAL_NONE &&
+ s->data.confirmed_flush != s->last_saved_confirmed_flush)
+ {
+ ReplicationSlotMarkDirty(s);
+ }
+
+ SpinLockRelease(&s->mutex);
+ }

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-09-05 08:15:55 Re: Improving the heapgetpage function improves performance in common scenarios
Previous Message Thomas Munro 2023-09-05 07:58:10 Re: old_snapshot_threshold bottleneck on replica