Re: pg_upgrade and logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_upgrade and logical replication
Date: 2023-03-09 06:35:36
Message-ID: CAA4eK1+pWXqDSO+QpGtLFqcCP5+qCXkvLN-YviRqdXV+d7Fdow@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>
>> > For the publisher nodes, that may be something nice to support (I'm assuming it
>> > could be useful for more complex replication setups) but I'm not interested in
>> > that at the moment as my goal is to reduce downtime for major upgrade of
>> > physical replica, thus *not* doing pg_upgrade of the primary node, whether
>> > physical or logical. I don't see why it couldn't be done later on, if/when
>> > someone has a use case for it.
>> >
>>
>> I thought there is value if we provide a way to upgrade both publisher
>> and subscriber.
>
>
> It's still unclear to me whether it's actually achievable on the publisher side, as running pg_upgrade leaves a "hole" in the WAL stream and resets the timeline, among other possible difficulties. Now, I don't know much about logical replication internals, so I'm clearly not the best person to answer those questions.
>

I think that is the part we need to analyze to see what the
challenges are. One part of the challenge is that we need to
preserve slots, which carry WAL locations such as restart_lsn and
confirmed_flush, and we need the WAL from those locations for
decoding. I haven't analyzed this, but isn't it possible that on a
clean shutdown we confirm that all the WAL has been sent to and
confirmed by the logical subscriber? In that case, I think truncating
WAL in pg_upgrade shouldn't be a problem.
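
To make the idea above concrete, here is a minimal Python sketch of the kind
of check being discussed: given each slot's confirmed_flush and an
end-of-WAL position, report the slots that have not caught up. The Slot
class, the function names, and the way end_of_wal is obtained are
assumptions for illustration only; this is not how pg_upgrade works today.

```python
from dataclasses import dataclass


def parse_lsn(text: str) -> int:
    """Convert an LSN in PostgreSQL's 'hi/lo' hexadecimal text form to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class Slot:
    # Hypothetical container; the field mirrors the confirmed_flush
    # location mentioned in the text.
    name: str
    confirmed_flush: str


def lagging_slots(slots, end_of_wal: str):
    """Return the names of slots whose confirmed_flush is behind end_of_wal."""
    target = parse_lsn(end_of_wal)
    return [s.name for s in slots if parse_lsn(s.confirmed_flush) < target]
```

The LSNs are compared as integers rather than as strings because a
textual comparison would order "0/A000000" after "0/10000000", which is
wrong. An empty result would mean that every slot has confirmed all WAL up
to the given position.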

>> Now, you came up with a use case linking it to a
>> physical replica where allowing an upgrade of only subscriber nodes is
>> useful. It is possible that users find your steps easy to perform and
>> don't find them error-prone, but it may be better to get some
>> confirmation of that. I haven't yet analyzed all the steps in
>> detail but let's see what others think.
>
>
> It's been quite some time since and no one seemed to chime in or object. IMO doing a major version upgrade with limited downtime (so something faster than stopping postgres and running pg_upgrade) has always been difficult and never prevented anyone from doing it, so I don't think that it should be a blocker for what I'm suggesting here, especially since the current behavior of pg_upgrade on a subscriber node is IMHO broken.
>
> Is there something that can be done for pg16? I was thinking that having a fix for the normal and easy case could be acceptable: only allowing pg_upgrade to optionally, and not by default, preserve the subscription relations IFF all subscriptions only have tables in ready state. Different states should be transient, and it's easy to check as a user beforehand and also easy to check during pg_upgrade, so it seems like an acceptable limitation (which I personally see as a good sanity check, but YMMV). It could be lifted in later releases if wanted anyway.
>
> It's unclear to me whether this limited scope would also require preserving the replication origins, but having looked at the code I don't think it would be much of a problem, as the local LSN doesn't have to be preserved.
>
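
The precondition proposed in the quoted text (every subscription table in
ready state) could be checked along these lines. This is only a sketch in
Python: it assumes the rows have already been fetched from
pg_subscription_rel as (subscription, table, srsubstate) tuples, and the
function name is made up for illustration.

```python
# srsubstate values in pg_subscription_rel: 'i' (initialize), 'd' (data is
# being copied), 'f' (finished table copy), 's' (synchronized), 'r' (ready).
READY = "r"


def non_ready_relations(rows):
    """Return the (subscription, table, state) tuples that are not in ready state.

    rows is assumed to be an iterable of (subscription, table, srsubstate)
    tuples obtained by the caller; fetching them is out of scope here.
    """
    return [(sub, tbl, state) for sub, tbl, state in rows if state != READY]
```

An empty result would mean the "normal and easy case" applies; anything
else would be a reason to refuse to preserve the subscription relations.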

I think we need to preserve replication origins, as they help us
determine the WAL location from which to start streaming after the
upgrade. If we don't preserve them, from which location will the
subscriber start streaming? We don't want to replicate WAL that has
already been sent.
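
As a toy illustration of that concern (not the actual apply-worker logic),
the Python sketch below models a subscriber that resumes from the remote
LSN recorded for its origin. The function name and the list-based stream
are assumptions, and the model deliberately ignores the publisher-side
slot, whose confirmed_flush also influences where decoding restarts in a
real setup.

```python
def changes_to_apply(stream, resume_after=None):
    """Select the changes a subscriber would apply when it resumes.

    stream: list of (lsn, change) pairs in increasing LSN order, with LSNs
    given as integers.
    resume_after: last remote LSN recorded for the origin, or None if the
    origin was not preserved, in which case this toy model replays
    everything in the stream.
    """
    if resume_after is None:
        return [change for _, change in stream]
    return [change for lsn, change in stream if lsn > resume_after]
```

With the origin's progress preserved, only changes past the recorded LSN
are selected; without it, the model has no way to skip changes that were
already applied.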

--
With Regards,
Amit Kapila.
