Re: pg_upgrade and logical replication

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade and logical replication
Date: 2023-02-17 15:35:01
Message-ID: 20230217153501.jx4klo2mezpqpada@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:
> On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > An easy workaround that I tried is to allow something like
> >
> > ALTER SUBSCRIPTION ... ENABLE WITH (refresh = true, copy_data = false)
> >
> > so that the refresh internally happens before the apply worker is started and
> > you just keep consuming the delta, which works on naive scenario.
> >
> > One concern I have with this approach is that the default values for both
> > "refresh" and "copy_data" for all other subcommands is "true, but we would
> > probably need a different default value in that exact scenario (as we know we
> > already have the data). I think that it would otherwise be safe in my very
> > specific scenario, assuming that you created the slot beforehand and moved the
> > slot's LSN at the promotion point, as even if you add non-empty tables to the
> > publication you will only need the delta whether those were initially empty or
> > not given your initial physical replica state.
> >
>
> This point is not very clear. Why would one just need delta even for new tables?

Because in my scenario I'm coming from physical replication, so I know that I
did replicate everything until the promotion LSN. Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN. In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

> > Any other scenario would make
> > this new option dangerous, if not entirely useless, but not more than any of
> > the current commands that lead to refreshing a subscription and have the same
> > options I guess.
> >
> > All in all, currently the only way to somewhat safely resume logical
> > replication after a pg_upgrade is to drop all the subscriptions that were
> > transferred during pg_upgrade on all databases and recreate them (using the
> > existing slots on the publisher side obviously), allowing the initial
> > connection. But this approach only works in the exact scenario I mentioned
> > (physical to logical replication, or at least a case where *all* the tables
> > where logically replicated prior to the pg_ugprade), otherwise you have to
> > recreate the follower node from scratch using logical repication.
> >
>
> I think if you dropped and recreated the subscriptions by retaining
> old slots, the replication should resume from where it left off before
> the upgrade. Which scenario are you concerned about?

I'm concerned about people not coming from physical replication. If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription. If it was modified and a non
empty table is added, you might need to copy the data of part of the tables and
keep replicating for the rest. It's hard to be sure from a user point of view,
and even if you knew you have no way to express it.

> > Is that indeed the current behavior, or did I miss something?
> >
> > Is this "resume logical replication on pg_upgraded node" something we want to
> > support better? I was thinking that we could add a new pg_dump mode (maybe
> > only usable during pg_upgrade) that also restores the pg_subscription_rel
> > content in each subscription or something like that. If not, should pg_upgrade
> > keep preserving the subscriptions as it doesn't seem safe to use them, or at
> > least document the hazards (I didn't find anything about it in the
> > documentation)?
> >
> >
>
> There is a mention of this in pg_dump docs. See [1] (When dumping
> logical replication subscriptions ...)

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy". As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed. It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relation per subscription.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2023-02-17 15:46:02 Re: pg_stat_statements and "IN" conditions
Previous Message Tom Lane 2023-02-17 15:09:35 Re: The output sql generated by pg_dump for a create function refers to a modified table name