Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-28 07:36:37
Message-ID: CAA4eK1J12EEvQmV4MQV1QgZ0Fs-0tw7Tw-FDRQ_5Zaym7dKvwg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Mon, Sep 25, 2023 at 2:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > > [1] https://www.postgresql.org/message-id/CAA4eK1%2BLtWDKXvxS7gnJ562VX%2Bs3C6%2B0uQWamqu%3DUuD8hMfORg%40mail.gmail.com
> > >
> > > I see. IIUC, without that commit e0b2eed [1], it may happen that the
> > > slot's on-disk confirmed_flush LSN value can be higher than the WAL
> > > LSN that's flushed to disk, no?
> > >
> >
> > No, without that commit, there is a very high possibility that even if
> > we have sent the WAL to the subscriber and got the acknowledgment of
> > the same, we would miss updating it before shutdown. This would lead
> > to upgrade failures because upgrades have no way to later identify
> > whether the remaining WAL records are sent to the subscriber.
>
> Thanks for clarifying. I'm trying to understand what happens without
> commit e0b2eed0 with an illustration:
>
> step 1: publisher - confirmed_flush LSN in the replication slot's
> on-disk structure is 80
> step 2: publisher - sends WAL at LSN 100
> step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 100
> step 4: publisher - shuts down without writing the new confirmed_flush
> LSN as 100 to disk, note that commit e0b2eed0 is not in place
> step 5: publisher - restarts
> step 6: subscriber - upon publisher restart, the subscriber requests
> WAL from publisher from LSN 100 as it tracks the last applied LSN in
> replication origin
>
> Now, if pg_upgrade with the patch in this thread is run on the
> publisher after step 4, it complains with "The slot \"%s\" has not
> consumed the WAL yet".
>
> Is my above understanding right?
>

Yes.
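
For anyone following along, the quoted steps can be sketched as a toy
model. This is only an illustration, not PostgreSQL code: the Slot class,
slot_has_consumed_wal() and run_scenario() are hypothetical names, the
persist_on_shutdown flag merely stands in for the behavior that commit
e0b2eed0 is described as adding, and the comparison against the latest
WAL LSN is a simplification of whatever check the patch really performs.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Toy model of a logical slot; not PostgreSQL's real structure."""
    in_memory_confirmed_flush: int
    on_disk_confirmed_flush: int

    def acknowledge(self, lsn: int) -> None:
        # A subscriber acknowledgment advances only the in-memory value.
        self.in_memory_confirmed_flush = max(self.in_memory_confirmed_flush, lsn)

    def shutdown(self, persist_on_shutdown: bool) -> None:
        # Assumption: persist_on_shutdown=True models the behavior attributed
        # to commit e0b2eed0 (the slot is written out before shutdown).
        if persist_on_shutdown:
            self.on_disk_confirmed_flush = self.in_memory_confirmed_flush


def slot_has_consumed_wal(slot: Slot, latest_wal_lsn: int) -> bool:
    # Hypothetical stand-in for the upgrade-time check: an offline tool
    # can only see the on-disk value, never the lost in-memory one.
    return slot.on_disk_confirmed_flush >= latest_wal_lsn


def run_scenario(persist_on_shutdown: bool) -> bool:
    slot = Slot(in_memory_confirmed_flush=80, on_disk_confirmed_flush=80)  # step 1
    latest_wal_lsn = 100                # step 2: publisher sends WAL at LSN 100
    slot.acknowledge(100)               # step 3: subscriber acknowledges LSN 100
    slot.shutdown(persist_on_shutdown)  # step 4: publisher shuts down
    return slot_has_consumed_wal(slot, latest_wal_lsn)


if __name__ == "__main__":
    print("without persisting on shutdown:", run_scenario(False))
    print("with persisting on shutdown:   ", run_scenario(True))
```

In this model the check fails only when the acknowledged LSN was never
written out, which is the situation after step 4 in the illustration.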

--
With Regards,
Amit Kapila.
