Re: Synchronizing slots from primary to standby

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-13 09:53:17
Message-ID: e7b63103-2a8c-4ee9-866a-ddba45ead388@gmail.com
Lists: pgsql-hackers

Hi,

On 11/13/23 5:24 AM, shveta malik wrote:
> On Thu, Nov 9, 2023 at 8:56 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> Apart from the above, I would like to discuss the slot sync work
>> distribution strategy of this patch. The current implementation as
>> explained in the commit message [1] works well if the slots belong to
>> multiple databases. It is clear from the data in emails [2][3][4] that
>> having more workers really helps if the slots belong to multiple
>> databases. But I think if all the slots belong to one or very few
>> databases then such a strategy won't be as good. Now, on one hand, we
>> get very good numbers for a particular workload with the strategy used
>> in the patch but OTOH it may not be adaptable to various different
>> kinds of workloads. So, I have a question whether we should try to
>> optimize this strategy for various kinds of workloads or for the first
>> version let's use a single-slot sync-worker and then we can enhance
>> the functionality in later patches either in PG17 itself or in PG18 or
>> later versions.
>
> I can work on separating the patch. We can first focus on single
> worker design and then we can work on multi-worker design either
> immediately (if needed) or we can target it in the second draft of the
> patch. I would like to know the thoughts of others on this.

If we need to put more thought into the worker distribution strategy,
then I also think it's better to focus on a single worker first and
improve/discuss a distribution design later on.

>
>> One thing to note is that a lot of the complexity of
>> the patch is attributed to the multi-worker strategy which may still
>> not be efficient, so there is an argument to go with a simpler
>> single-slot sync-worker strategy and then enhance it in future
>> versions as we learn more about various workloads. It will also help
>> to develop this feature incrementally instead of doing all the things
>> in one go and taking a much longer time than it should.
>
> Agreed. With multi-workers, a lot of complexity (DSA, locks, etc.) has
> come into play. We can decide better on our workload distribution
> strategy among workers once we have more clarity on different types of
> workloads.
>

Agreed.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
