Re: [PoC] pg_upgrade: allow to upgrade publisher node

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date: 2023-10-05 08:58:53
Message-ID: CAFiTN-uBArfBEjUFhVioT8RnNhyKOu3jnorWzH_Ogbe5+qpPVA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 5, 2023 at 1:48 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Oct 3, 2023 at 9:58 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Fri, Sep 29, 2023 at 5:27 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > Yeah, the approach forces developers to check the decodability.
> > > But the benefit seems smaller than the effort it requires, because the function
> > > would be used only by pg_upgrade. Could you tell me if you have another use case
> > > in mind? We may be able to adopt it if we have one...
> >
> > I'm attaching 0002 patch (on top of v45) which implements the new
> > decodable callback approach that I have in mind. IMO, this new
> > approach is extensible, better than the current approach (hard-coding
> > of certain WAL records that may be generated during pg_upgrade) taken
> > by the patch, and helps deal with the issue that custom WAL resource
> > managers can have with the current approach taken by the patch.
> >
>
> Today, I discussed this problem with Andres at PGConf NYC and he
> suggested the following. To verify whether there is any pending unexpected
> WAL after shutdown, we can have an API like
> pg_logical_replication_slot_advance() which will simply process
> records without actually sending anything downstream.

So I assume that in each lower-level decode function (e.g. heap_decode())
we will add a check: if we are examining the WAL for an upgrade, then
at that level we return true or false depending on whether the WAL
record is decodable. Is my understanding correct? At first thought,
this approach looks better and more generic.

> In this new API,
> we will start with each slot's restart_lsn location and try to process
> till the end of WAL, if we encounter any WAL that needs to be
> processed (like we need to send the decoded WAL downstream) we can
> return a false indicating that there is an unexpected WAL. The reason
> to start with restart_lsn is that it is the location that we use to
> start scanning the WAL anyway.

Yeah, that makes sense.
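
To make the proposed scan concrete, here is a minimal toy model in Python. It is
a sketch only, not PostgreSQL code: WalRecord, needs_processing() and
slot_has_no_pending_wal() are hypothetical names, and the set of record types
treated as harmless is an assumption chosen for illustration. The point is just
the control flow: start at the slot's restart_lsn, walk to the end of WAL
without sending anything, and report failure on the first record that would
have to be decoded and sent downstream.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: record kinds that are fine to find after a clean shutdown.
# Real WAL records are identified by resource manager ID and info bits,
# not by strings; this set is illustrative only.
HARMLESS_KINDS = {"XLOG_CHECKPOINT_SHUTDOWN", "XLOG_RUNNING_XACTS", "XLOG_SWITCH"}


@dataclass(frozen=True)
class WalRecord:
    lsn: int   # start position of the record
    kind: str  # simplified record type


def needs_processing(record: WalRecord) -> bool:
    """Stand-in for the per-rmgr decode routines (e.g. heap_decode()):
    report whether decoding this record would produce output to send."""
    return record.kind not in HARMLESS_KINDS


def slot_has_no_pending_wal(wal: Iterable[WalRecord], restart_lsn: int) -> bool:
    """Walk the WAL from restart_lsn to the end without sending anything.
    Return False as soon as a record that needs processing is found."""
    for record in wal:
        if record.lsn < restart_lsn:
            continue  # before the point where the slot starts scanning
        if needs_processing(record):
            return False
    return True


wal = [
    WalRecord(0x100, "HEAP_INSERT"),
    WalRecord(0x200, "XLOG_RUNNING_XACTS"),
    WalRecord(0x300, "XLOG_CHECKPOINT_SHUTDOWN"),
]
print(slot_has_no_pending_wal(wal, restart_lsn=0x200))  # True
print(slot_has_no_pending_wal(wal, restart_lsn=0x100))  # False
```

In this model the decision about what counts as "unexpected" lives in one
place (needs_processing), which is roughly what pushing the check down into
the decode functions would give us.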

> Then, we should also try to create slots before invoking pg_resetwal.
> The idea is that we can write a new binary mode function that will do
> exactly what pg_resetwal does to compute the next segment and use that
> location as a new location (restart_lsn) to create the slots in a new
> node. Then, pass it to pg_resetwal by using the existing option '-l
> walfile'. As we don't have any API that takes restart_lsn as input, we
> can write a new API probably for binary mode to create slots that do
> take restart_lsn as input. This will ensure that there is no new WAL
> inserted by background processes between resetwal and the creation of
> slots.

Yeah, that looks cleaner IMHO.
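
For illustration, the segment arithmetic could look like the Python sketch
below. This is a simplified model under stated assumptions, not what
pg_resetwal literally does: it takes a single end-of-WAL LSN as input (the real
tool works from the control file and the files present in pg_wal), it assumes
the default 16MB WAL segment size, and next_segment() and format_lsn() are
hypothetical helper names. It shows how one computed value can serve both as
the restart_lsn for the new slots and as the file name given to '-l walfile'.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16MB segments


def next_segment(end_lsn: int, timeline: int = 1,
                 seg_size: int = WAL_SEGMENT_SIZE) -> tuple[int, str]:
    """Return (start LSN, WAL file name) of the segment after end_lsn."""
    # Number of segments covered by one 32-bit "xlogid" (high half of an LSN).
    segs_per_xlogid = 0x1_0000_0000 // seg_size
    next_segno = end_lsn // seg_size + 1
    restart_lsn = next_segno * seg_size
    # WAL file names are three 8-digit hex fields: timeline, then the
    # segment number split into high and low parts.
    walfile = "%08X%08X%08X" % (timeline,
                                next_segno // segs_per_xlogid,
                                next_segno % segs_per_xlogid)
    return restart_lsn, walfile


def format_lsn(lsn: int) -> str:
    """Render an LSN in the usual high/low hexadecimal notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


restart_lsn, walfile = next_segment(0x3000028)
print(format_lsn(restart_lsn))  # 0/4000000
print(walfile)                  # 000000010000000000000004
```

With these values, the slots would be created with restart_lsn 0/4000000 and
pg_resetwal would be run with '-l 000000010000000000000004', so both agree on
where the new cluster's WAL begins.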

> The other potential problem Andres pointed out is that during shutdown
> if due to some reason, the walreceiver goes down, we won't be able to
> send the required WAL and users won't be able to ensure that because
> even after restart the same situation can happen. The ideal way is to
> have something that puts the system in READ ONLY state during shutdown
> and then we can probably allow walreceivers to reconnect and receive
> the required WALs. As we don't have such functionality available and
> it won't be easy to achieve the same, we can leave this for now.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
