Re: Replication slot stats misgivings

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication slot stats misgivings
Date: 2021-05-04 05:34:23
Message-ID: CALDaNm0UubKbW4xLLrOhEiXj9qxJvn1VsWQpa_MhgxG4nqK50g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 4, 2021 at 9:48 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, May 3, 2021 at 10:21 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, May 3, 2021 at 5:48 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Mon, May 3, 2021 at 2:27 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Apr 29, 2021 at 10:37 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Wed, Apr 28, 2021 at 7:43 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Wed, Apr 28, 2021 at 3:25 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > >
> > > > > > > I am not sure if any of these alternatives are a good idea. What do
> > > > > > > you think? Do you have any other ideas for this?
> > > > > >
> > > > > > I've been considering some ideas but don't come up with a good one
> > > > > > yet. It’s just an idea and not tested but how about having
> > > > > > CreateDecodingContext() register before_shmem_exit() callback with the
> > > > > > decoding context to ensure that we send slot stats even on
> > > > > > interruption. And FreeDecodingContext() cancels the callback.
> > > > > >
> > > > >
> > > > > Is it a good idea to send stats while exiting and rely on the same? I
> > > > > think before_shmem_exit is mostly used for the cleanup purpose so not
> > > > > sure if we can rely on it for this purpose. I think we can't be sure
> > > > > that in all cases we will send all stats, so maybe Vignesh's patch is
> > > > > sufficient to cover the cases where we avoid losing it in cases where
> > > > > we would have sent a large amount of data.
> > > > >
> > > >
> > > > Sawada-San, any thoughts on this point?
> > >
> > > before_shmem_exit is mostly used to the cleanup purpose but how about
> > > on_shmem_exit()? pgstats relies on that to send stats at the
> > > interruption. See pgstat_shutdown_hook().
> > >
> >
> > Yeah, that is worth trying. Would you like to give it a try?
>
> Yes.
>
> In this approach, I think we will need to have a static pointer in
> logical.c pointing to LogicalDecodingContext that we’re using. At
> StartupDecodingContext(), we set the pointer to the just created
> LogicalDecodingContext and register the callback so that we can refer
> to the LogicalDecodingContext on that callback. And at
> FreeDecodingContext(), we reset the pointer to NULL (however, since
> FreeDecodingContext() is not called when an error happens we would
> need to ensure resetting it somehow). But, after more thought, if we
> have the static pointer in logical.c it would rather be better to have
> a global function that sends slot stats based on the
> LogicalDecodingContext pointed by the static pointer and can be called
> by ReplicationSlotRelease(). That way, we don’t need to worry about
> erroring out cases as well as interruption cases, although we need to
> have a new static pointer.
>
> I've attached a quick-hacked patch. I also incorporated the change
> that calls UpdateDecodingStats() at FreeDecodingContext() so that we
> can send slot stats also in the case where we spilled/streamed changes
> but finished without commit/abort/prepare record.
>
> > I think
> > it still might not cover the cases where we error out in the backend
> > while decoding via APIs because at that time we won't exit, maybe for
> > that we can consider Vignesh's patch.
>
> Agreed. It seems to me that the approach of the attached patch is
> better than the approach using on_shmem_exit(). So if we want to avoid
> having the new static pointer and function for this purpose we can
> consider Vignesh’s patch.
>

I'm ok with using either my patch or Sawada san's patch, Even I had
the same thought of whether we should have a static variable thought
pointed out by Sawada san. Apart from that I had one minor comment:
This comment needs to be corrected "andu sed to sent"
+/*
+ * Pointing to the currently-used logical decoding context andu sed to sent
+ * slot statistics on releasing slots.
+ */
+static LogicalDecodingContext *MyLogicalDecodingContext = NULL;
+

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2021-05-04 06:58:36 Re: Incorrect snapshots while promoting hot standby node when 2PC is used
Previous Message Peter Smith 2021-05-04 05:13:24 Re: AlterSubscription_refresh "wrconn" wrong variable?