Re: pg_upgrade and publication/subscription problem

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade and publication/subscription problem
Date: 2021-11-25 13:55:37
Message-ID: CAA4eK1L+zbu63egsQWjMCASbqbsY5OKn-yr_fuZcU8y+PJCbKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 25, 2021 at 5:13 PM Marcos Pegoraro <marcos(at)f10(dot)com(dot)br> wrote:
>
> A publication for all tables was running fine, Master is a PostgreSQL 11.11. Replica was running version 13 (don´t remember minor version).
>
> Then we tried to update only subscriber server, nothing was done on master side.
>
> Then we did ...
> - installed postgresql-14.
> - configured postgresql.conf to be similar to previous.
> - on version 13 disabled subscription - alter subscription disable.
> - changed both port to run pg_upgrade.
> - stop services for both 13 e 14.
> - /usr/lib/postgresql/14/bin/pg_upgrade -b /usr/lib/postgresql/13/bin -B /usr/lib/postgresql/14/bin -d /etc/postgresql/13/main/ -D /etc/postgresql/14/main/ -j 2 --link -p 9999 -P 9998 -U postgres -v
> - when finished upgrade process, we removed version 13 and ran vacuumdb -p 9998 -U postgres --all --analyze-in-stages
> - last step was to enable that subscription.
> - just wait for the subscriber to get the data changed, pg_upgrade ran for 15 minutes, this should be synced in a few seconds ...
> - few seconds later we remembered that some other tables were created on publication server, so we did a refresh publication.
>
> Then, some minutes later we got lots of log entries "duplicate key value violates unique constraint pk..." because it was trying to COPY that table from master.
>
> We disable subscription again until we solve, as remains.
>
> Selecting from pg_subscription_rel all old tables are with srsubstate i for initialize, not s for synchronized or r for ready, as they should. And all srsublsn of these records were null, so it lost synchronization coordination for all tables which existed before this upgrade process.
>

The reason is after an upgrade, there won't be any data in
pg_subscription_rel, and only when you tried to refresh it is trying
to sync again which leads to the "duplicate key value ..." problem you
are seeing.

> So, my first question is, as our publication server continues running, lots of updates were processed, so how can I synchronize both sides without recreating that publication ?
>

Don't you want to eventually upgrade the publisher node as well? You
can refer to blog [1] for the detailed steps.

> And my second question is, is this problem documented ? Is this problem expected to happen ?
>

Yes, the way you are doing I think it is bound to happen. There is
some discussion about why this is happening in email [2]. AFAIK, it is
not documented and if so, I think it will be a good idea to document
it.

[1] - https://elephanttamer.net/?p=58
[2] - https://www.postgresql.org/message-id/CALDaNm2-SRGHK0rqJQu7rGiS4hDAb7Nib5HbojEN5ubaXGs2CA%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2021-11-25 14:02:22 Re: POC: Cleaning up orphaned files using undo logs
Previous Message houzj.fnst@fujitsu.com 2021-11-25 13:39:45 RE: row filtering for logical replication