Re: pg_upgrade and logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_upgrade and logical replication
Date: 2023-11-23 08:43:45
Message-ID: CALDaNm3wyYY5ywFpCwUVW1_Di1af3WxeZggGEDQEu8qa58a7FQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 21 Nov 2023 at 07:11, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Nov 20, 2023 at 09:49:41AM +0530, Amit Kapila wrote:
> > On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >> There are couple of things happening here: a) In the first part we
> >> take care of setting subscription relation to SYNCDONE and dropping
> >> the replication slot at publisher node, only if drop replication slot
> >> is successful the relation state will be set to SYNCDONE , if drop
> >> replication slot fails the relation state will still be in
> >> FINISHEDCOPY. So if there is a failure in the drop replication slot we
> >> will not have an issue as the tablesync worker will be in
> >> FINISHEDCOPYstate and this state is not allowed for upgrade. When the
> >> state is in SYNCDONE the tablesync slot will not be present. b) In the
> >> second part we drop the replication origin, even if there is a chance
> >> that drop replication origin fails due to some reason, there will be
> >> no problem as we do not copy the table sync replication origin to the
> >> new cluster while upgrading. Since the table sync replication origin
> >> is not copied to the new cluster there will be no replication origin
> >> leaks.
> >
> > And, this will work because in the SYNCDONE state, while removing the
> > origin, we are okay with missing origins. It seems not copying the
> > origin for tablesync workers in this state (SYNCDONE) relies on the
> > fact that currently, we don't use those origins once the system
> > reaches the SYNCDONE state but I am not sure it is a good idea to have
> > such a dependency and that upgrade assuming such things doesn't seems
> > ideal to me.
>
> Hmm, yeah, you mean the replorigin_drop_by_name() calls in
> tablesync.c. I did not pay much attention about that in the code, but
> your point sounds sensible.
>
> (I have not been able to complete an analysis of the risks behind 's'
> to convince myself that it is entirely safe, but leaks are scary as
> hell if this gets automated across a large fleet of nodes..)
>
> > Personally, I think allowing an upgrade in 'i'
> > (initialize) state or 'r' (ready) state seems safe because in those
> > states either slots/origins don't exist or are dropped. What do you
> > think?
>
> I share a similar impression about 's'. From a design point of view,
> making the conditions to reach harder in the first implementation
> makes the user experience stricter, but that's safer regarding leaks
> and it is still possible to relax these choices in the future
> depending on the improvement pieces we are able to figure out.

Based on the suggestions just to have safe init and ready state, I
have made the changes to handle the same in v18 version patch
attached.

Regards,
Vignesh

Attachment Content-Type Size
v18-0001-Preserve-the-full-subscription-s-state-during-pg.patch text/x-patch 48.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2023-11-23 08:48:52 Re: pg_upgrade and logical replication
Previous Message Maxim Orlov 2023-11-23 08:43:41 Re: How to stop autovacuum silently