Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2021-01-27 08:17:06
Message-ID: CALj2ACXQ_wfQBzaVqrTa+Tg2PwVke78-Y_qQZYgVijtM1S_GEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 27, 2021 at 1:25 PM Tang, Haiying
<tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
> I choose 5 cases which pick parallel insert plan in CTAS to measure the patched performance. Each case run 30 times.
>
> Most of the tests execution become faster with this patch.
>
> However, Test NO 4(create table xxx as table xxx.) appears performance degradation. I tested various table size(2/10/20 millions), they all have a 6%-10% declines. I think it may need some check at this problem.
>
>
>
> Below are my test results. 'Test NO' is corresponded to 'Test NO' in attached test_ctas.sql file.
>
> reg%=(patched-master)/master
>
> Test NO | Test Case |reg% | patched(ms) | master(ms)
>
> --------|--------------------------------|------|--------------|-------------
>
> 1 | CTAS select from table | -9% | 16709.50477 | 18370.76660
>
> 2 | Append plan | -14% | 16542.97807 | 19305.86600
>
> 3 | initial plan under Gather node| -5% | 13374.27187 | 14120.02633
>
> 4 | CTAS table | 10% | 20835.48800 | 18986.40350
>
> 5 | CTAS select from execute | -6% | 16973.73890 | 18008.59789
>
>
>
> About Test NO 4:
>
> In master(HEAD), this test case picks serial seq scan.
>
> query plan likes:
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Seq Scan on public.tenk1 (cost=0.00..444828.12 rows=10000012 width=244) (actual time=0.005..1675.268 rows=10000000 loops=1)
>
> Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4 Planning Time: 0.053 ms Execution Time: 20165.023 ms
>
>
>
> With this patch, it will choose parallel seq scan and parallel insert.
>
> query plan likes:
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Gather (cost=1000.00..370828.03 rows=10000012 width=244) (actual time=20428.823..20437.143 rows=0 loops=1)
>
> Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
>
> Workers Planned: 4
>
> Workers Launched: 4
>
> -> Create test
>
> -> Parallel Seq Scan on public.tenk1 (cost=0.00..369828.03 rows=2500003 width=244) (actual time=0.021..411.094 rows=2000000 loops=5)
>
> Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
>
> Worker 0: actual time=0.023..390.856 rows=1858407 loops=1
>
> Worker 1: actual time=0.024..468.587 rows=2264494 loops=1
>
> Worker 2: actual time=0.023..473.170 rows=2286580 loops=1
>
> Worker 3: actual time=0.027..373.727 rows=1853216 loops=1 Planning Time: 0.053 ms Execution Time: 20437.643 ms
>
>
>
> test machine spec:
>
> CentOS 8.2, 128G RAM, 40 processors, disk SAS

Thanks a lot for the performance tests and test cases. I will analyze
why the performance is degrading one case and respond soon.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takashi Menjo 2021-01-27 08:28:25 Re: [PoC] Non-volatile WAL buffer
Previous Message Fujii Masao 2021-01-27 08:08:56 Re: [PATCH] remove pg_standby