RE: Parallel Inserts in CREATE TABLE AS

From: "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: RE: Parallel Inserts in CREATE TABLE AS
Date: 2021-01-22 11:46:28
Message-ID: 34549865667a4a3bb330ebfd035f85d3@G08CNEXMBPEKD05.g08.fujitsu.local
Lists: pgsql-hackers

Hi Bharath,

I'm trying to take some performance measurements on your patch v23.
When I started, I found an issue with unbalanced tuple distribution among workers (99% of the tuples are read by one worker) in a specific case, which makes the "parallel select" part give no performance gain.
I then found that it's not introduced by your patch, because it also happens on master (HEAD). I don't know how to deal with it, so I'm putting it here to see if anybody knows what's going wrong or has good ideas on how to handle this issue.

Here are the conditions to produce the issue:
1. A high-spec CPU environment (say, more than 20 processors). With fewer CPUs it also happens, but less obviously (about 40% of the tuples on one worker in my tests).
2. The query plan is "serial insert + parallel select"; I have reproduced this behavior with CTAS, SELECT INTO, and INSERT INTO ... SELECT.
3. The select part needs to read a large amount of data (e.g. 100 million rows out of 200 million).

Based on the above, IMHO, I guess it may be caused by the leader's write rate not keeping up with the workers' read rate, so the tuples of one worker get blocked in its queue and pile up more and more.
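
A possible way to check this guess (just a sketch, run from another session while the CTAS is executing) is to watch the wait events of the parallel workers. If the leader really can't keep up, I'd expect the stalled workers to show an IPC wait on the shared tuple queue (e.g. MessageQueueSend):

select pid, backend_type, wait_event_type, wait_event
from pg_stat_activity
where backend_type in ('parallel worker', 'client backend');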

Below is my test info:
1. test spec environment
CentOS 8.2, 128G RAM, 40 processors, disk SAS

2. test data prepare
create table x(a int, b int, c int);
create index on x(a);
insert into x select generate_series(1,200000000),floor(random()*(10001-1)+1),floor(random()*(10001-1)+1);
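
For anyone trying to reproduce this, a session setup along these lines should be enough to get a 4-worker plan (just an example, values to be tuned per environment):

set max_parallel_workers_per_gather = 4;
analyze x;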

3. test execute results
*Patched CTAS*: please look at Worker 2; 99% of the tuples are read by it.
explain analyze verbose create table test(a,b,c) as select a,floor(random()*(10001-1)+1),c from x where b%2=0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1942082.77 rows=1000001 width=16) (actual time=0.203..24023.686 rows=100006268 loops=1)
   Output: a, floor(((random() * '10000'::double precision) + '1'::double precision)), c
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Seq Scan on public.x  (cost=0.00..1831082.66 rows=250000 width=8) (actual time=0.016..4367.035 rows=20001254 loops=5)
         Output: a, c
         Filter: ((x.b % 2) = 0)
         Rows Removed by Filter: 19998746
         Worker 0: actual time=0.016..19.265 rows=94592 loops=1
         Worker 1: actual time=0.027..31.422 rows=94574 loops=1
         Worker 2: actual time=0.014..21744.549 rows=99627749 loops=1
         Worker 3: actual time=0.015..19.347 rows=94586 loops=1
 Planning Time: 0.098 ms
 Execution Time: 91054.828 ms

*Non-patched CTAS*: please look at Worker 0; again, 99% of the tuples are read by it.
explain analyze verbose create table test(a,b,c) as select a,floor(random()*(10001-1)+1),c from x where b%2=0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1942082.77 rows=1000001 width=16) (actual time=0.283..19216.157 rows=100003148 loops=1)
   Output: a, floor(((random() * '10000'::double precision) + '1'::double precision)), c
   Workers Planned: 4
   Workers Launched: 4
   ->  Parallel Seq Scan on public.x  (cost=0.00..1831082.66 rows=250000 width=8) (actual time=0.020..4380.360 rows=20000630 loops=5)
         Output: a, c
         Filter: ((x.b % 2) = 0)
         Rows Removed by Filter: 19999370
         Worker 0: actual time=0.013..21805.647 rows=99624833 loops=1
         Worker 1: actual time=0.016..19.790 rows=94398 loops=1
         Worker 2: actual time=0.013..35.340 rows=94423 loops=1
         Worker 3: actual time=0.035..19.849 rows=94679 loops=1
 Planning Time: 0.083 ms
 Execution Time: 91151.097 ms

I'm still working on the performance tests of your patch; if I make some progress, I will post my results here.

Regards,
Tang
