Re: WIP/PoC for parallel backup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc: Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, Kashif Zeeshan <kashif(dot)zeeshan(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP/PoC for parallel backup
Date: 2020-05-21 06:53:56
Message-ID: CAA4eK1+PLW-d1FKTMjrAufHjx4hgfdvn0+ukdgXD9yWFnvHviQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 21, 2020 at 11:36 AM Rushabh Lathia
<rushabh(dot)lathia(at)gmail(dot)com> wrote:
>
> On Thu, May 21, 2020 at 10:47 AM Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com> wrote:
>>
>>>>
>>>> During an offlist discussion with Robert, he pointed out that the current
>>>> basebackup code doesn't report a wait event while reading files, which
>>>> can change what pg_stat_activity shows. Can you please apply his latest
>>>> patch to improve basebackup.c's code [1], which takes care of that wait
>>>> event, before collecting the data again?
>>>>
>>>> [1] - https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
>>>
>>>
>>>
>>> Sure, we can try out this and do a similar run to collect the pg_stat_activity output.
>>
>>
>> Have you had the chance to try this out?
>
>
> Yes. My colleague Suraj tried this and here are the pg_stat_activity output files.
>
> Wait events were captured every 3 seconds during the backup for:
> 1: Parallel backup for 100GB data with 4 workers (pg_stat_activity_j4_100GB.txt)
> 2: Normal backup (without parallel backup patch) for 100GB data (pg_stat_activity_normal_backup_100GB.txt)
>
> Here is the observation:
>
> The total number of events (pg_stat_activity) captured during the above runs:
> - 314 events for normal backups
> - 316 events for parallel backups (-j 4)
>
> BaseBackupRead wait event numbers: (newly added)
> 37 - in normal backups
> 25 - in the parallel backup (-j 4)
>
> ClientWrite wait event numbers:
> 175 - in normal backup
> 1098 - in parallel backups
>
> ClientRead wait event numbers:
> 0 - in normal backup
> 326 - in parallel backups, across different processes (all in idle state)
>

It might be interesting to see why the ClientRead/ClientWrite counts have
increased so much, and whether we can reduce them.
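
As a starting point for that comparison, here is a minimal sketch that
tallies wait events from the sampled output and normalizes the counts by the
number of samples, so that the two runs can be compared directly. The input
format (one 'wait_event_type|wait_event|state' line per sampled backend) and
the function names are assumptions for illustration; they are not taken from
the attached files.

```python
from collections import Counter


def tally_wait_events(lines):
    """Count how often each wait_event appears in sampled pg_stat_activity rows.

    Assumed (hypothetical) row format: 'wait_event_type|wait_event|state'.
    """
    counts = Counter()
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3 or not parts[1]:
            continue  # skip blank lines, malformed rows, rows with no wait event
        counts[parts[1]] += 1
    return counts


def per_sample(counts, samples):
    """Average number of backends seen in each wait event per sample."""
    return {event: n / samples for event, n in counts.items()}
```

If the 314 and 316 figures above are read as the number of samples in each
run (an assumption), per_sample() would put ClientWrite at roughly 0.56
backends per sample for the normal backup (175/314) and roughly 3.47 for the
parallel backup (1098/316).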

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com