Re: WIP/PoC for parallel backup

From: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP/PoC for parallel backup
Date: 2020-02-17 08:39:08
Message-ID: CADM=JegCQ_k83DbtXcD--8Au6516u1Di6xWmDbXTxS3qVBp83g@mail.gmail.com
Lists: pgsql-hackers

Thanks Jeevan. Here is the documentation patch.

On Mon, Feb 10, 2020 at 6:49 PM Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:

> Hi Asif,
>
> On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
> wrote:
>
>>
>> Here are the updated patches, taking care of the issues pointed out
>> earlier. This patch adds the following commands (with the specified options):
>>
>> START_BACKUP [LABEL '<label>'] [FAST]
>> STOP_BACKUP [NOWAIT]
>> LIST_TABLESPACES [PROGRESS]
>> LIST_FILES [TABLESPACE]
>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>> SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X'] [NOVERIFY_CHECKSUMS]
>>
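To make the grammar above concrete, here is a rough sketch of how a client might assemble two of these commands as strings. The helper names are hypothetical and are not taken from the patch. The quoting rule and the assumption that each FILE is passed as a single-quoted path are my guesses for illustration only; the patch itself defines the actual syntax.

```python
def quote(value):
    """Single-quote a value, doubling embedded quotes (assumed quoting rule)."""
    return "'" + value.replace("'", "''") + "'"


def start_backup(label=None, fast=False):
    """Build: START_BACKUP [LABEL '<label>'] [FAST]"""
    parts = ["START_BACKUP"]
    if label is not None:
        parts += ["LABEL", quote(label)]
    if fast:
        parts.append("FAST")
    return " ".join(parts)


def send_files(files, start_wal_location=None, noverify_checksums=False):
    """Build: SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X'] [NOVERIFY_CHECKSUMS]"""
    # Assumption: each FILE is written as a single-quoted path.
    parts = ["SEND_FILES", "(" + ", ".join(quote(f) for f in files) + ")"]
    if start_wal_location is not None:
        parts += ["START_WAL_LOCATION", quote(start_wal_location)]
    if noverify_checksums:
        parts.append("NOVERIFY_CHECKSUMS")
    return " ".join(parts)
```

Each worker would presumably send its own SEND_FILES command over its own connection, but the sketch only covers string construction, not the connection handling or the order of commands.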
>>
>> Parallel backup does not use the tablespace map, so I have removed that
>> option from the above commands. There is a pending patch to remove
>> exclusive backup; at that point we can further refactor the
>> do_pg_start_backup function to drop the tablespace information and move
>> the creation of the tablespace_map file to the client.
>>
>>
>> I have disabled the maxrate option for parallel backup. I intend to send
>> out a separate patch for it. Robert previously suggested implementing
>> throttling on the client side. I found the original email thread [1]
>> where throttling was proposed and added to the server. In that thread,
>> it was originally implemented on the client side, but following many
>> suggestions it was moved to the server side.
>>
>> So, I have a few suggestions on how we can implement this:
>>
>> 1- Add another option to pg_basebackup (i.e. per-worker-maxrate) that
>> lets the user choose the bandwidth allocation for each worker. This
>> approach can be implemented on the client side as well as on the
>> server side.
>>
>> 2- Divide maxrate equally among the workers at first, and then let the
>> main thread keep adjusting it whenever one of the workers finishes.
>> I believe this would only be possible if we handle throttling on the
>> client. Also, as I understand it, implementing this would introduce an
>> additional mutex to protect the bandwidth consumption data, so that the
>> rate can be adjusted according to the data received by the threads.
>>
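As a rough illustration of the second suggestion, here is a minimal client-side sketch. It is not the patch's implementation: the class and function names are hypothetical, rates are in bytes per second, and redistribution is simplified to an equal split among the workers still running rather than being driven by per-thread consumption data.

```python
import threading


class SharedRateBudget:
    """Split a total rate limit evenly among the workers still running."""

    def __init__(self, total_rate, n_workers):
        if n_workers <= 0:
            raise ValueError("n_workers must be positive")
        self._lock = threading.Lock()  # the additional mutex mentioned above
        self._total_rate = float(total_rate)
        self._active = n_workers

    def per_worker_rate(self):
        """Current share for one worker; 0.0 once every worker has finished."""
        with self._lock:
            if self._active == 0:
                return 0.0
            return self._total_rate / self._active

    def worker_finished(self):
        """Called when a worker is done, so the others get a larger share."""
        with self._lock:
            if self._active > 0:
                self._active -= 1


def throttle_delay(bytes_received, elapsed, rate):
    """Seconds a worker should sleep so bytes_received/elapsed stays <= rate."""
    if rate <= 0:
        return 0.0
    return max(0.0, bytes_received / rate - elapsed)
```

With a total of 1000 bytes/sec and 4 workers, each worker starts at 250 bytes/sec; after one finishes, the remaining three each get roughly 333 bytes/sec. The first suggestion would correspond to giving every worker a fixed rate and skipping the shared state altogether.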
>> [1]
>> https://www.postgresql.org/message-id/flat/521B4B29.20009%402ndquadrant.com#189bf840c87de5908c0b4467d31b50af
>>
>> --
>> Asif Rehman
>> Highgo Software (Canada/China/Pakistan)
>> URL : www.highgo.ca
>>
>>
>
> The latest changes look good to me. However, the patch set is missing the
> documentation. Please add it.
>
> Thanks
>
> --
> Jeevan Chalke
> Associate Database Architect & Team Lead, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca

Attachment: 0006-parallel-backup-documentation.patch (application/octet-stream, 15.0 KB)