Re: Synchronizing slots from primary to standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-07-24 03:30:15
Message-ID: CAA4eK1J2h_KhaJNG-Q1W38pgHHpOV+aaC9VN-GuLPTyRfUtgDg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 24, 2023 at 8:03 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Jul 21, 2023 at 5:16 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > Thanks Bharat for letting us know. It is okay to split the patch, it
> > may definitely help to understand the modules better but shall we take
> > a step back and try to reevaluate the design first before moving to
> > other tasks?
>
> Agree that design comes first. FWIW, I'm attaching the v9 patch set
> that I have with me. It can't be a perfect patch set unless the design
> is finalized.
>
> > I analyzed more on the issues stated in [1] for replacing LIST_SLOTS
> > with SELECT query. On rethinking, it might not be a good idea to
> > replace this cmd with SELECT in Launcher code-path
>
> I think there are open fundamental design aspects, before optimizing
> LIST_SLOTS, see below. I'm sure we can come back to this later.
>
> > Secondly, I was thinking if the design proposed in the patch is the
> > best one. No doubt, it is the most simplistic design and thus may
> > .......... Any feedback is appreciated.
>
> Here are my thoughts about this feature:
>
> Current design:
>
> 1. On primary, never allow walsenders associated with logical
> replication slots to go ahead of physical standbys that are candidates
> for future primary after failover. This enables subscribers to connect
> to new primary after failover.
> 2. On all candidate standbys, periodically sync logical slots from
> primary (creating the slots if necessary) with one slot sync worker
> per logical slot.
>
> Important considerations:
>
> 1. Does this design guarantee the row versions required by subscribers
> aren't removed on candidate standbys as raised here -
> https://www.postgresql.org/message-id/20220218222319.yozkbhren7vkjbi5%40alap3.anarazel.de?
>
> It seems safe with logical decoding on standbys feature. Also, a
> test-case from upthread is already in patch sets (in v9 too)
> https://www.postgresql.org/message-id/CAAaqYe9FdKODa1a9n%3Dqj%2Bw3NiB9gkwvhRHhcJNginuYYRCnLrg%40mail.gmail.com.
> However, we need to verify the use cases extensively.
>

Agreed.

> 2. All candidate standbys will start one slot sync worker per logical
> slot which might not be scalable.
>

Yeah, that doesn't sound like a good idea, but IIRC the proposed patch
uses one worker per database (for all slots corresponding to a
database).

> Is having one (or a few more - not
> necessarily one for each logical slot) worker for all logical slots
> enough?
>

I guess for a large number of slots there is a possibility of a large
gap in syncing the slots, which probably means we need to retain the
corresponding WAL for a much longer time on the primary. If we can
prove that the gap won't be large enough to matter, then this would
probably be worth considering; otherwise, I think we should find a way
to scale the number of workers to avoid the large gap.
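
To make the trade-off concrete, here is a rough back-of-envelope model
(not taken from the patch). It assumes slots are split evenly across
workers and that each worker syncs its slots one after another; the
function names, the per-slot sync cost and the WAL generation rate are
all hypothetical inputs chosen for illustration.

```python
import math


def worst_case_sync_gap_ms(num_slots, num_workers, per_slot_sync_ms):
    """Upper bound on the time between two syncs of the same slot.

    Assumption (not from the patch): slots are divided evenly among the
    workers and each worker processes its slots sequentially.
    """
    if num_slots < 0 or num_workers <= 0 or per_slot_sync_ms < 0:
        raise ValueError("invalid input")
    slots_per_worker = math.ceil(num_slots / num_workers)
    return slots_per_worker * per_slot_sync_ms


def extra_wal_retained_bytes(gap_ms, wal_bytes_per_sec):
    """Rough amount of WAL generated on the primary during one sync gap.

    Assumption: WAL is generated at a constant rate during the gap.
    """
    return gap_ms * wal_bytes_per_sec // 1000


if __name__ == "__main__":
    # Hypothetical workload: 1000 slots, 50 ms per slot, 16 MB/s of WAL.
    for workers in (1, 4, 10):
        gap = worst_case_sync_gap_ms(1000, workers, 50)
        wal = extra_wal_retained_bytes(gap, 16 * 1024 * 1024)
        print(f"workers={workers} gap_ms={gap} extra_wal_bytes={wal}")
```

Under these assumptions the gap shrinks roughly in proportion to the
number of workers, so whether a single worker is enough depends on how
much per-slot sync cost and WAL volume we expect in practice.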

--
With Regards,
Amit Kapila.
