Re: pg_upgrade and logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade and logical replication
Date: 2023-02-18 10:42:52
Message-ID: CAA4eK1KW38JFR=PF47cDGmGyEUQPCbRXJ_wrT164ckv-8u1ijg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:
> > On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > > I'm concerned about people not coming from physical replication. If you just
> > > had some "normal" logical replication, you can't assume that you already have
> > > all the data from the upstream subscription. If it was modified and a non
> > > empty table is added, you might need to copy the data of part of the tables and
> > > keep replicating for the rest. It's hard to be sure from a user point of view,
> > > and even if you knew you have no way to express it.
> > >
> >
> > Can't the user create a separate publication for such newly added
> > tables and a corresponding new subscription on the downstream node?
>
> Yes that seems like a safe way to go, but it relies on users being very careful
> if they don't want to get corrupted logical standby, and I think it's
> impossible to run any check to make sure that the subscription is adequate?
>

I can't think of any straightforward way but one can probably take of
dump of data on both nodes using pg_dump and then compare it.

> > Now, I think it would be a bit tricky if the user already has a
> > publication defined with FOR ALL TABLES. In that case, we probably
> > need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> > we currently don't have.
>
> Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
> use case.
>

Okay, but if we would have functionality like EXCEPT (list of tables),
one could do ALTER PUBLICATION .. before doing REFRESH on the
subscriber-side.

> > >
> > > Indeed, but it's barely saying "It is then up to the user to reactivate the
> > > subscriptions in a suitable way" and "It might also be appropriate to truncate
> > > the target tables before initiating a new full table copy". As I mentioned, I
> > > don't think there's a suitable way to reactivate the subscription, at least if
> > > you don't want to miss some records, so truncating all target tables is the
> > > only fully safe way to proceed. It seems quite silly to have to do so just
> > > because pg_upgrade doesn't retain the list of relation per subscription.
> > >
> >
> > I also don't know if there is any other safe way for newly added
> > tables apart from the above suggestion to create separate publications
> > but that can work only in specific cases.
>
> I might be missing something, but what could go wrong if pg_upgrade could emit
> a bunch of commands like:
>
> ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
>

How will we know the STATE and LSN of each relation? But I think even
if know that what is the guarantee that publisher side still has still
retained the corresponding slots?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-02-18 12:19:49 Re: PATCH: Using BRIN indexes for sorted output
Previous Message Robert Haas 2023-02-18 10:21:06 Re: Weird failure with latches in curculio on v15