Re: Tables getting stuck at 's' state during logical replication

From: Padmavathi G <padma9(dot)9(dot)1999(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tables getting stuck at 's' state during logical replication
Date: 2023-05-05 13:56:56
Message-ID: CABBh-xR90AJ0s4u9HY4xErS1KHCDDUZ1W5sJX0fktdyCrpGQEA@mail.gmail.com
Lists: pgsql-hackers

Some background on the setup on which I am trying to carry out the upgrade:

We have a pod in a Kubernetes cluster running the postgres 11 image.
We are using logical replication to perform the upgrade.

Steps followed for logical replication:

1. Created a new pod in the same kubernetes cluster with the latest
postgres 15 image
2. Created a publication (say publication 1) in the old pod including all
tables in a database
3. Created a subscription (say subscription 1) in the new pod for the above
mentioned publication
4. When monitoring the subscription via pg_subscription_rel on the
subscriber, I noticed that out of 45 tables, 20 were in the 'r' state and 25
were in the 's' state. They remained in those states for almost 2 days with
no change, even though the logs showed "synchronization workers for
<table_name> finished" for the tables in the 's' state as well.
5. Then I removed the tables which got stuck in the 's' state from
publication 1 and created a new publication (publication 2) with only these
tables which got stuck and created a new subscription (subscription 2) for
this publication in the subscriber.
6. On monitoring subscription 2 via pg_subscription_rel, I noticed that
out of 25 tables, 12 were now in the 'r' state and 13 again got stuck in the
's' state. I repeated this process of dropping the stuck tables from the
publication and creating a new publication and subscription for them, and
in this way I was finally able to bring all tables in sync. However, the
replication origins for these tables were still present.
7. On querying pg_replication_origin, I saw that every subscription had
one origin and every table which got stuck in each publication had one
origin with roname pg_<subid>_<relid>. Even though they were stuck, these
replication origins were not removed.
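
For reference, the checks in steps 4 and 7 can be sketched roughly as below. This is only an illustration, not the exact commands I ran: the two query strings are what I would expect to run on the subscriber, and the helper names (summarize_states, not_ready, classify_origin) are made up for this sketch. The functions work on rows already fetched from those queries, so no database connection is assumed.

```python
import re
from collections import Counter

# srsubstate codes as documented for the pg_subscription_rel catalog.
STATE_NAMES = {
    "i": "initialize",
    "d": "data is being copied",
    "f": "finished table copy",
    "s": "synchronized",
    "r": "ready",
}

# Queries to run on the subscriber (for example via psql). They are kept
# as plain strings here; this sketch does not open a connection.
SUBSCRIPTION_REL_QUERY = """
SELECT srrelid::regclass AS relname, srsubstate
FROM pg_subscription_rel
ORDER BY srsubstate, relname;
"""
ORIGIN_QUERY = "SELECT roident, roname FROM pg_replication_origin;"

# Matches pg_<subid> (subscription origin) and pg_<subid>_<relid>
# (per-table sync origin), the naming described in step 7.
ORIGIN_RE = re.compile(r"^pg_(\d+)(?:_(\d+))?$")


def summarize_states(rows):
    """Count tables per srsubstate; rows are (relname, srsubstate) pairs."""
    return Counter(state for _, state in rows)


def not_ready(rows):
    """Return the sorted names of tables that have not reached 'r'."""
    return sorted(rel for rel, state in rows if state != "r")


def classify_origin(roname):
    """Split an origin name into (kind, subid, relid)."""
    m = ORIGIN_RE.match(roname)
    if m is None:
        return ("other", None, None)
    subid = int(m.group(1))
    if m.group(2) is None:
        return ("subscription", subid, None)
    return ("tablesync", subid, int(m.group(2)))
```

Feeding the output of the first query into not_ready() gives the list of tables to watch, and classify_origin() separates the per-subscription origins from the leftover per-table ones.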

On Fri, May 5, 2023 at 4:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Fri, May 5, 2023 at 3:04 PM Padmavathi G <padma9(dot)9(dot)1999(at)gmail(dot)com>
> wrote:
> >
> > Hello,
> > I am trying out logical replication for upgrading postgres instance from
> version 11 to 15.x. In the process, I noticed that some tables get stuck in
> the 's' state during logical replication and they do not move to the 'r'
> state. I tried to drop the subscription and create a new subscriber, but
> the same thing repeated. Also, each time different tables get stuck, it is
> not like the same tables get stuck every time.
> >
>
> This is strange. BTW, we don't save slots after the upgrade, so the
> subscriptions in the upgraded node won't be valid. We have some
> discussion on this topic in threads [1][2]. So, I think after the
> upgrade one needs to anyway re-create the subscriptions. Can you share
> your exact steps for the upgrade and what is the state before the
> upgrade? Is it possible to share some minimal test case to show the
> exact problem you are facing?
>
> [1] -
> https://www.postgresql.org/message-id/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud
> [2] -
> https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com
>
>
> --
> With Regards,
> Amit Kapila.
>
