Re: Using failover slots for PG-non_PG logical replication

From:	Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	shveta malik <shveta(dot)malik(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Subject:	Re: Using failover slots for PG-non_PG logical replication
Date:	2025-07-03 13:37:36
Message-ID:	CAExHW5s_JZdPG5ZrJn9SSd50okB1oA0sGawM8MhFvB3T9zRB=A@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Jul 3, 2025 at 9:32 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 2, 2025 at 5:50 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Wed, Jul 2, 2025 at 12:36 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Jul 2, 2025 at 10:50 AM Ashutosh Bapat
> > > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > The failover slots documentation [1] is good for PG - PG logical
> > > > replication, but the first two queries require pg_subscription which
> > > > may not be present in non-PG downstream. Somebody looking to setup
> > > > failover slots for non-PG subscriber may not find the page useful.
> > >
> > > Okay. It appears to me that the entire document at [1] is
> > > specifically intended for a built-in replication setup, and the
> > > corresponding page was written with that context in mind.
> > >
> > > > However, the third query, when modified to mention the replication
> > > > slots relevant to the downstream is useful to them. How to find the
> > > > replication slots to be synchronized is a problem specific to the type
> > > > of downstream. Such a setup should add those slots to
> > > > sync_replication_slots. I think the chapter should mention that the
> > > > 3rd query should also include the slots mentioned in
> > > > sync_replication_slots for PG-non_PG logical replication setup.
> > > >
> > >
> > > sync_replication_slots is a boolean which enables a physical standby
> > > to synchronize logical failover slots. Did you mean something else?
> >
> > I confused this with the actual list of slots to be synchronized.
> > Sorry for that. The slots to be synchronized can be obtained from the
> > primary by querying pg_replication_slots with failover = true.
> >
>
> Note that primary may have slots corresponding to multiple subscriber
> nodes, so whether querying all slots on primary gives a correct answer may
> depend on the use case. For example, say a user wants to do some sort
> of load balancing such that some of the subscriber/downstream nodes
> are served by a standby, then directly querying all slots from
> pg_replication_slots from the primary won't give the correct answer.
> In a typical failover case as well, if slots corresponding to a
> particular downstream are ready, then that should be sufficient to
> continue replication from the standby. Then, also, there is a case
> when the primary node is down, then such a query won't work; it can
> only work when there is a planned switchover.

I think there are two different points of view. Section 29.3 is
written from a single subscriber's point of view, i.e. can a given
subscriber continue logical replication from the new primary after
failover? The other is the primary's point of view, i.e. if the
primary fails over, will all the subscribers be able to continue
replication? For example, in case of a planned failover, the failover
orchestrator can check whether all the replication slots have been
synchronized. If so, it goes ahead with the failover. I think the
section is the right place to give guidance for this case as well.
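
As a rough illustration of the two points of view, here is a minimal Python sketch of the check an orchestrator might make. The function name failover_blockers and the row layout (dicts keyed by pg_replication_slots column names) are assumptions for illustration, not anything that exists in PostgreSQL. Passing one subscriber's slots gives the subscriber's view; passing every failover slot found on the primary gives the primary's view.

```python
def failover_blockers(required_slots, standby_rows):
    """Return {slot_name: reason} for each required slot that is not ready.

    required_slots: slot names that must survive the failover (hypothetical input).
    standby_rows:   rows read from pg_replication_slots on the standby, as dicts
                    with slot_name, synced, temporary and conflicting keys.
    """
    by_name = {row["slot_name"]: row for row in standby_rows}
    blockers = {}
    for name in required_slots:
        row = by_name.get(name)
        if row is None:
            blockers[name] = "missing on standby"
        elif not (row["synced"] and not row["temporary"] and not row["conflicting"]):
            # Same condition as the failover_ready expression in the documentation.
            blockers[name] = "not failover-ready"
    return blockers
```

An empty result would mean the orchestrator can go ahead; anything else names the slots that still block the failover.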

> Considering all these
> points, I am not sure if it is a good idea to mention querying the
> primary for all slots marked with failover=true. However, I agree that
> we should mention something for non-native logical replication
> solutions, something on the lines of what Shveta is proposing. OTOH,
> if you or Shveta have some clear guidelines for how a downstream can
> find the required slots which can work in all or most cases, then it
> is okay to mention that as well.
>

How about this:
We change the following sentence in the third paragraph:
To confirm that the standby server is indeed ready for failover <new
addition> so that a given PostgreSQL subscriber can continue logical
replication </new addition>, follow ... . <new addition> A
non-PostgreSQL downstream may need to devise a different way to find
the slots corresponding to its subscriptions or use the next section.
</new addition>

Then add a separate paragraph at the end or a separate section like below.

In order to check whether a standby server is ready for failover so
that all the subscribers, PostgreSQL as well as non-PostgreSQL, can
continue logical replication, follow these steps to make sure that all
the replication slots on the primary server that have
failover = true are synchronized to the standby server.
1. On the primary server, run the following query:
SELECT slot_name FROM pg_replication_slots WHERE failover AND NOT temporary;

2. Check that the logical replication slots identified above exist on
the standby server and are ready for failover.
SELECT slot_name, (synced AND NOT temporary AND NOT conflicting) AS
failover_ready
FROM pg_replication_slots
WHERE slot_name IN
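
One way a script could connect the two steps is sketched below in Python, on the assumption that the IN list in step 2 is meant to hold the slot names returned by step 1. The helper name standby_check_query is hypothetical, and the %s placeholders assume a driver that uses the DB-API "format" parameter style (psycopg does); other drivers would need a different placeholder.

```python
def standby_check_query(slot_names):
    """Build the step 2 query for the slot names returned by step 1.

    Returns (sql, params) so the names are passed as bound parameters
    rather than being spliced into the SQL text.
    """
    names = tuple(slot_names)
    if not names:
        # An empty IN () list is not valid SQL, so refuse early.
        raise ValueError("no failover slots to check")
    placeholders = ", ".join(["%s"] * len(names))
    sql = (
        "SELECT slot_name, "
        "(synced AND NOT temporary AND NOT conflicting) AS failover_ready "
        "FROM pg_replication_slots "
        f"WHERE slot_name IN ({placeholders})"
    )
    return sql, names
```

A slot from step 1 that does not appear in the result at all has not been synchronized to the standby, so the caller should compare the returned names against the input list and not only look at failover_ready.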

Does that look good?

--
Best Wishes,
Ashutosh Bapat

In response to

Re: Using failover slots for PG-non_PG logical replication at 2025-07-03 04:02:10 from Amit Kapila

Responses

Re: Using failover slots for PG-non_PG logical replication at 2025-07-04 03:53:17 from Amit Kapila