Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-09-28 05:14:06
Message-ID: CALj2ACXptaLCQW1qgjqwLHZWkeej+v8JBkT=RHfYs24fSZ1EZw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 25, 2023 at 2:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > > [1] https://www.postgresql.org/message-id/CAA4eK1%2BLtWDKXvxS7gnJ562VX%2Bs3C6%2B0uQWamqu%3DUuD8hMfORg%40mail.gmail.com
> >
> > I see. IIUC, without that commit e0b2eed [1], it may happen that the
> > slot's on-disk confirmed_flush LSN value can be higher than the WAL
> > LSN that's flushed to disk, no?
> >
>
> No, without that commit, there is a very high possibility that even if
> we have sent the WAL to the subscriber and got the acknowledgment of
> the same, we would miss updating it before shutdown. This would lead
> to upgrade failures because upgrades have no way to later identify
> whether the remaining WAL records are sent to the subscriber.

Thanks for clarifying. I'm trying to understand what happens without
commit e0b2eed0, using an illustration:

step 1: publisher - confirmed_flush LSN in the replication slot's
on-disk structure is 80
step 2: publisher - sends WAL at LSN 100
step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 100
step 4: publisher - shuts down without writing the new confirmed_flush
LSN of 100 to disk; note that commit e0b2eed0 is not in place
step 5: publisher - restarts
step 6: subscriber - upon publisher restart, the subscriber requests
WAL from the publisher starting at LSN 100, as it tracks the last
applied LSN in its replication origin

Now, if pg_upgrade with the patch in this thread is run on the
publisher after step 4, it complains with "The slot \"%s\" has not
consumed the WAL yet", because the slot's on-disk confirmed_flush LSN
is still 80.
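
To make steps 1 to 4 concrete, here is a toy model in Python. It is
only a sketch of my understanding, not PostgreSQL code: ToySlot,
upgrade_check and run_scenario are made-up names, persist_on_shutdown
stands in for the behaviour of commit e0b2eed0, and the model ignores
any WAL written during the shutdown itself.

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Hypothetical model of a logical replication slot (not PostgreSQL code)."""
    on_disk_confirmed_flush: int    # value persisted in the slot's on-disk state
    in_memory_confirmed_flush: int  # value advanced when the subscriber acks

    def ack(self, lsn: int) -> None:
        # Step 3: an acknowledgment advances only the in-memory value.
        self.in_memory_confirmed_flush = max(self.in_memory_confirmed_flush, lsn)

    def shutdown(self, persist_on_shutdown: bool) -> None:
        # Step 4: persist_on_shutdown=True is an assumption standing in for
        # commit e0b2eed0; False models the commit being absent.
        if persist_on_shutdown:
            self.on_disk_confirmed_flush = self.in_memory_confirmed_flush


def upgrade_check(slot: ToySlot, latest_wal_lsn: int) -> bool:
    """Toy stand-in for the upgrade check: True if the slot's on-disk
    confirmed_flush has caught up with the latest WAL LSN."""
    return slot.on_disk_confirmed_flush >= latest_wal_lsn


def run_scenario(persist_on_shutdown: bool) -> bool:
    slot = ToySlot(on_disk_confirmed_flush=80, in_memory_confirmed_flush=80)  # step 1
    latest_wal_lsn = 100                 # step 2: publisher sends WAL at LSN 100
    slot.ack(100)                        # step 3: subscriber acknowledges LSN 100
    slot.shutdown(persist_on_shutdown)   # step 4: publisher shuts down
    return upgrade_check(slot, latest_wal_lsn)
```

In this model, run_scenario(persist_on_shutdown=False) fails the check
because the on-disk value stays at 80, while
run_scenario(persist_on_shutdown=True) passes it.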

Is my above understanding right?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
