Re: pg_upgrade and logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_upgrade and logical replication
Date: 2023-09-28 04:47:35
Message-ID: CALDaNm0FO1o5skL8Y1sMPoSta-ecocsy0AA1GemLQmPqZ3S4YQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 25 Sept 2023 at 10:05, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Sep 22, 2023 at 4:36 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:
> > > It is because after upgrade of both publisher and subscriber, the
> > > subscriptions won't work. Both publisher and subscriber should work,
> > > otherwise, the logical replication set up won't work. I think we can
> > > probably do this, if we can document clearly how the user can make
> > > their logical replication set up work after upgrade.
> >
> > Yeah, well, this comes back to my original point that the upgrade of
> > publisher nodes and subscriber nodes should be treated as two
> > different problems or we're mixing apples and oranges (and a node
> > could have both subscriber and publishers). While being able to
> > support both is a must, it is going to be a two-step process at the
> > end, with the subscribers done first and the publishers done after.
> > That's also kind of the point that Julien makes in top message of this
> > thread.
> >
> > I agree that docs are lacking in the proposed patch in terms of
> > restrictions, assumptions and process flow, but taken in isolation the
> > problem of the publishers is not something that this patch has to take
> > care of.
> >
>
> I also don't think that this patch has to solve the problem of
> publishers in any way but as per my understanding, if due to some
> reason we are not able to do the upgrade of publishers, this can add
> more steps for users than they have to do now for logical replication
> set up after upgrade. This is because now after restoring the
> subscription rel's and origin, as soon as we start replication after
> creating the slots on the publisher, we will never be able to
> guarantee data consistency. So, they need to drop the entire
> subscription setup including truncating the relations, and then set it
> up from scratch which also means they need to somehow remember or take
> a dump of the current subscription setup. According to me, the key
> point is to have a mechanism to set up slots correctly to allow
> replication (or subscriptions) to work after the upgrade. Without
> that, it appears to me that we are restoring a subscription where it
> can start from some random LSN and can easily lead to data consistency
> issues where it can miss some of the updates.
>
> This is the primary reason why I prioritized to work on the publisher
> side before getting this patch done, otherwise, the solution for this
> patch was relatively clear. I am not sure but I guess this could be
> the reason why originally we left it in the current state, otherwise,
> restoring subscription rel's or origin doesn't seem to be too much of
> an additional effort than what we are doing now.

I have tried to analyze the steps for upgrading the subscriber with
HEAD and with the upgrade patches, Here are the steps for the same:
Current steps to upgrade subscriber in HEAD:
1) Upgrade the subscriber server
2) Start subscriber server
3) truncate the tables
4) Alter the subscriptions to point to new slots in the subscriber
5) Enable the subscriptions
6) Alter subscription to refresh the publications

Steps to upgrade If we commit only the subscriber upgrade patch:
1) Upgrade the subscriber server
2) Start subscriber server
3) truncate the tables
Note: We will have to drop the subscriptions as we have made changes
to the pg_subscription_rel
4) But drop subscription will throw error:
postgres=# DROP SUBSCRIPTION test1 cascade;
ERROR: could not drop replication slot "test1" on publisher: ERROR:
replication slot "test1" does not exist
5) Alter the subscription to set slot_name to none
6) Make a note of all the subscriptions that are present
7) drop the subscriptions
8) Create the subscriptions

The number of steps will increase in this case.

Steps to upgrade If we commit publisher upgrade patch first and then
the subscriber upgrade patch patch:
1) Upgrade the subscriber server
2) Start subscriber server
3) Enable the subscription
4) Alter subscription to refresh the publications

Based on the above, I also feel it is better to get the upgrade
publisher patch committed first, as a) it will reduce the data copying
time(as truncate is not required) b) the number of steps will reduce
c) all the use cases will be handled.

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-09-28 04:51:38 Re: Synchronizing slots from primary to standby
Previous Message Fujii.Yuki@df.MitsubishiElectric.co.jp 2023-09-28 04:40:40 RE: Partial aggregates pushdown