Re: persist logical slots to disk during shutdown checkpoint

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: persist logical slots to disk during shutdown checkpoint
Date: 2023-09-01 09:15:48
Message-ID: CAA4eK1+pjnHssjPrk=Z2CHkWZb3VwKeL7ySwT120qNEmHDf-aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 1, 2023 at 1:11 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Thu, Aug 31, 2023 at 7:28 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 31, 2023 at 6:12 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > > + qr/Streaming transactions committing after ([A-F0-9]+\/[A-F0-9]+),
> > > > reading WAL from ([A-F0-9]+\/[A-F0-9]+)./
> > >
> > > I don't think the LSN reported in this message is guaranteed to be the
> > > confirmed_flush LSN of the slot. It's usually confirmed_flush but not
> > > always. It's the LSN that snapshot builder computes based on factors
> > > including confirmed_flush. There's a chance that this test will fail
> > > sometimes because of this behaviour.
> > >
> >
> > I think I am missing something here because as per my understanding,
> > the LOG referred by the test is generated in CreateDecodingContext()
> > before which we shouldn't be changing the slot's confirmed_flush LSN.
> > The LOG [1] refers to the slot's persistent value for confirmed_flush,
> > so how it could be different from what the test is expecting.
> >
> > [1]
> > errdetail("Streaming transactions committing after %X/%X, reading WAL
> > from %X/%X.",
> > LSN_FORMAT_ARGS(slot->data.confirmed_flush),
>
> I was afraid that we may move confirmed_flush while creating the
> snapshot builder when creating the decoding context. But I don't see
> any code doing that. So may be we are safe.
>

We are safe in that respect. As far as I understand there is no reason
to be worried.

>
But if the log message
> changes, this test would fail - depending upon the log message looks a
> bit fragile, esp. when we have a way to access the data directly
> reliably.
>

This message is there from the very begining (b89e1510) and I can't
forsee a reason to change such a message. But even if we change, we
can always change the test output or test accordingly, if required. I
think it is a matter of preference to which way we can write the test,
so let's not argue too much on this. I find current way slightly more
reliable but we can change it if we see any problem.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2023-09-01 09:23:07 Re: PATCH: Add REINDEX tag to event triggers
Previous Message David Geier 2023-09-01 08:47:10 Re: Eliminate redundant tuple visibility check in vacuum