From: | Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, Kashif Zeeshan <kashif(dot)zeeshan(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP/PoC for parallel backup |
Date: | 2020-05-04 13:22:37 |
Message-ID: | CAGPqQf3zzOgxsoozY439QqQf6MGj01he9_A9Ngipgd3PL1RqMw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Apr 30, 2020 at 4:15 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage
> <suraj(dot)kharage(at)enterprisedb(dot)com> wrote:
> >
> > Hi,
> >
> > We at EnterpriseDB did some performance testing around this parallel
> > backup to check how this is beneficial and below are the results. In this
> > testing, we run the backup -
> > 1) Without Asif’s patch
> > 2) With Asif’s patch and combination of workers 1,2,4,8.
> >
> > We ran those tests on two setups
> >
> > 1) Client and Server both on the same machine (Local backups)
> >
> > 2) Client and server on a different machine (remote backups)
> >
> >
> > Machine details:
> >
> > 1: Server (on which local backups performed and used as server for
> > remote backups)
> >
> > 2: Client (Used as a client for remote backups)
> >
> >
> ...
> >
> >
> > Client & Server on the same machine, the result shows around 50%
> > improvement in parallel run with worker 4 and 8. We don’t see the huge
> > performance improvement with more workers being added.
> >
> >
> > Whereas, when the client and server on a different machine, we don’t see
> > any major benefit in performance. This testing result matches the testing
> > results posted by David Zhang up thread.
> >
> >
> >
> > We ran the test for 100GB backup with parallel worker 4 to see the CPU
> > usage and other information. What we noticed is that server is consuming
> > the CPU almost 100% the whole time and pg_stat_activity shows that server
> > is busy with ClientWrite most of the time.
> >
> >
>
> Was this for a setup where the client and server were on the same
> machine or where the client was on a different machine? If it was for
> the case where both are on the same machine, then ideally, we should
> see ClientRead events in a similar proportion?
>
In that particular setup, the client and server were on different machines.
> During an offlist discussion with Robert, he pointed out that current
> basebackup's code doesn't account for the wait event for the reading
> of files which can change what pg_stat_activity shows? Can you please
> apply his latest patch to improve basebackup.c's code [1] which will
> take care of that waitevent before getting the data again?
>
> [1] -
> https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
>
Sure, we can try that patch and do a similar run to collect the
pg_stat_activity output.
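
For illustration only, here is a minimal sketch of how sampled pg_stat_activity rows from such a run could be tallied by wait event. The helper names (`tally_wait_events`, `as_percentages`) and the polling query are hypothetical and are not part of the patch or of the test setup described above. The query assumes the backup connections show up with backend_type = 'walsender'; running it against a server is left to whatever client is in use.

```python
from collections import Counter

# Hypothetical query to poll periodically while the backup runs.
# Assumption: base backup connections appear as 'walsender' backends.
SAMPLE_QUERY = """
SELECT pid, wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'walsender'
"""


def tally_wait_events(samples):
    """Count (wait_event_type, wait_event) pairs across polled rows.

    Each row is (pid, wait_event_type, wait_event). A NULL wait_event
    (None here) means the backend was not waiting when it was sampled.
    """
    counts = Counter()
    for _pid, event_type, event in samples:
        if event is None:
            counts[("-", "not waiting")] += 1
        else:
            counts[(event_type, event)] += 1
    return counts


def as_percentages(counts):
    """Convert raw sample counts into a percentage of all samples."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: 100.0 * n / total for key, n in counts.items()}
```

Comparing the percentages from a run with and without the basebackup.c patch would show whether time previously reported under ClientWrite moves to a file-read wait event.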
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
>
--
Rushabh Lathia