Re: WIP/PoC for parallel backup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>
Cc: Kashif Zeeshan <kashif(dot)zeeshan(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP/PoC for parallel backup
Date: 2020-04-21 11:48:17
Message-ID: CAA4eK1+qEzMDKirv+X8e9a6vTGAiumm_9UgMpT-_Sa5Y9G-V3w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman <asifr(dot)rehman(at)gmail(dot)com> wrote:
>
> I did some tests a while back, and here are the results. The tests were done to simulate
> a live database environment using pgbench.
>
> machine configuration used for this test:
> Instance Type: t2.xlarge
> Volume Type : io1
> Memory (MiB) : 16384
> vCPU # : 4
> Architecture : X86_64
> IOPS : 16000
> Database Size (GB) : 102
>
> The setup consists of 3 machines.
> - one for database instances
> - one for pg_basebackup client and
> - one for pgbench with some parallel workers, simulating SELECT loads.
>
> Backup duration (min), with pgbench running parallel clients simulating SELECT load:
>
> pgbench load | basebackup | 4 workers | 8 workers | 16 workers
> 50 clients   |      69.25 |     20.44 |     19.86 |      20.15
> 100 clients  |     154.75 |     49.28 |     45.27 |      20.35
>

Thanks for sharing the results; they show a nice speedup! However, I
think we should try to find out what exactly causes this speedup. In
the recent discussion on another thread related to this topic, Andres
pointed out that he doesn't think we can gain much by having multiple
connections [1]. The speedups we are seeing might be due to some
internal limitations (like small buffers) [2]. It would help if you
could share perf reports from both the server side and the
pg_basebackup side. We don't need a pgbench-type workload to see what
causes the speedup.
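
To put the quoted numbers in terms of speedup over plain basebackup,
here is a small sketch. Only the durations come from the results quoted
above; the names DURATIONS_MIN and speedups are made up for
illustration.

```python
# Backup durations in minutes, copied from the quoted results.
# Outer key: number of pgbench clients generating SELECT load.
DURATIONS_MIN = {
    50: {"basebackup": 69.25, "4 workers": 20.44,
         "8 workers": 19.86, "16 workers": 20.15},
    100: {"basebackup": 154.75, "4 workers": 49.28,
          "8 workers": 45.27, "16 workers": 20.35},
}


def speedups(durations):
    """Return baseline duration divided by each parallel duration."""
    baseline = durations["basebackup"]
    return {
        name: baseline / minutes
        for name, minutes in durations.items()
        if name != "basebackup"
    }


if __name__ == "__main__":
    for clients, durations in DURATIONS_MIN.items():
        for name, ratio in speedups(durations).items():
            print(f"{clients} clients, {name}: {ratio:.2f}x")
```

With 50 clients the ratio stays between roughly 3.4x and 3.5x for 4, 8,
and 16 workers, so adding workers beyond 4 does not change much there.
That by itself does not say where the time goes, which is why the perf
reports would be useful.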

[1] - https://www.postgresql.org/message-id/20200420201922.55ab7ovg6535suyz%40alap3.anarazel.de
[2] - https://www.postgresql.org/message-id/20200421064420.z7eattzqbunbutz3%40alap3.anarazel.de

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
