Re: Parallelize stream replication process

From: Asim Praveen <pasim(at)vmware(dot)com>
To: Li Japin <japinli(at)hotmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallelize stream replication process
Date: 2020-09-16 11:43:14
Message-ID: 54AB2364-A227-4724-8104-1743F42A8009@vmware.com
Lists: pgsql-hackers

> On 16-Sep-2020, at 8:32 AM, Li Japin <japinli(at)hotmail(dot)com> wrote:
>
> Thanks for clarifying the questions!
>
>> On Sep 15, 2020, at 12:41 PM, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>>
>> I think we must ask a few questions:
>>
>> 1. What's the major gain we get out of this? Is it that the time to
>> stream gets reduced or something else?
>
> I think that when the database fails over, parallel stream replication might shorten the recovery time.
>
>> If the answer to the above point is something solid, then
>> 2. How do we distribute the work to multiple processes?
>> 3. Do we need all of the workers to maintain the order in which they
>> read WAL files (on the publisher) and apply the changes (on the
>> subscriber)?
>> 4. Do we want to map the sender/publisher workers to
>> receiver/subscriber workers on a one-to-one basis? If not, how do we
>> do it?
>> 5. How do sender and receiver workers communicate?
>> 6. What if we have multiple subscribers/receivers?
>>
>> I'm no expert in replication, I may be wrong as well. Others may have
>> better thoughts.
>>
>
> Maybe we can distribute the work to multiple processes according to the WAL record type.
>
> As a first step, I think we can parallelize the replay process. We can classify the WAL records by type or RmgrId,
> and then replay them in parallel where possible.
>

This is a rather hard problem to solve, mainly because the (partial)
order inherent in the WAL stream must be preserved when distributing
subsets of WAL records for parallel replay. The order can be
characterised as follows:

(1) All records emitted by a transaction must be replayed before
replaying the commit record emitted by that transaction.

(2) Commit records emitted by different transactions must be replayed
in the order in which they appear in the WAL stream.
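
To make the two constraints above concrete, here is a minimal sketch of a
checker that decides whether a proposed replay order respects them. It is
an illustration only, not PostgreSQL code: the Record type, its fields, and
respects_wal_order are hypothetical names, and the model assumes each
transaction emits exactly one commit record and considers only
constraints (1) and (2).

```python
from typing import NamedTuple, Sequence


class Record(NamedTuple):
    """Hypothetical, simplified WAL record (not PostgreSQL's real layout)."""
    lsn: int         # position of the record in the WAL stream
    xid: int         # transaction that emitted the record
    is_commit: bool  # True for the transaction's commit record


def respects_wal_order(wal: Sequence[Record], replay: Sequence[Record]) -> bool:
    """Return True if `replay` is a reordering of `wal` that satisfies
    constraints (1) and (2)."""
    # The replay must contain exactly the records that are in the WAL.
    if sorted(wal) != sorted(replay):
        return False

    committed = set()      # xids whose commit record has been replayed
    last_commit_lsn = -1   # LSN of the most recently replayed commit

    for rec in replay:
        if rec.is_commit:
            # (2) Commit records must be replayed in WAL order.
            if rec.lsn < last_commit_lsn:
                return False
            last_commit_lsn = rec.lsn
            committed.add(rec.xid)
        elif rec.xid in committed:
            # (1) A record of this transaction was replayed after
            # the transaction's commit record.
            return False
    return True


# Two interleaved transactions, 10 and 20, in WAL order.
wal = [
    Record(lsn=1, xid=10, is_commit=False),
    Record(lsn=2, xid=20, is_commit=False),
    Record(lsn=3, xid=10, is_commit=True),
    Record(lsn=4, xid=20, is_commit=True),
]

# Non-commit records of different transactions may be swapped.
reordered_ok = [wal[1], wal[0], wal[2], wal[3]]
# Violates (1): commit of xid 10 replayed before its own record.
commit_too_early = [wal[2], wal[0], wal[1], wal[3]]
# Violates (2): commit of xid 20 replayed before commit of xid 10.
commits_swapped = [wal[0], wal[1], wal[3], wal[2]]
```

Under this model, the freedom available to parallel replay lies in the
relative order of non-commit records from different transactions; the
commit records act as the synchronization points.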

Asim
