Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2021-03-19 07:31:21
Message-ID: CALj2ACWFvNm4d_uqT2iECPqaXZjEd-O+y8xbghvqXeMLj0pxGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 19, 2021 at 12:45 PM tanghy(dot)fnst(at)fujitsu(dot)com
<tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
> >I analyzed performance of parallel inserts in CTAS for different cases
> >with tuple size 32bytes, 59bytes, 241bytes and 1064bytes. We could
> >gain if the tuple sizes are lower. But if the tuple size is larger
> >i..e 1064bytes, there's a regression with parallel inserts.
>
> Thanks for the update.
> BTW, May be you have some more testcases that can reproduce this regression easily.
> Can you please share some of the testcase (with big tuple size) with me.

They are pretty simple though. I think someone can also check if the
same regression exists for parallel inserts in "INSERT INTO SELECT"
patch set as well for larger tuple sizes.

[1]
DROP TABLE tenk1;
CREATE UNLOGGED TABLE tenk1(c1 int, c2 int);
INSERT INTO tenk1 values(generate_series(1,100000000),
generate_series(1,100000000));
explain analyze verbose create table test as select * from tenk1;

DROP TABLE tenk1;
CREATE UNLOGGED TABLE tenk1(c1 int, c2 int, c3 varchar(8), c4
varchar(8), c5 varchar(8));
INSERT INTO tenk1 values(generate_series(1,100000000),
generate_series(1,100000000),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)));
explain analyze verbose create table test as select * from tenk1;

DROP TABLE tenk1;
CREATE UNLOGGED TABLE tenk1(c1 bigint, c2 bigint, c3 name, c4 name, c5
name, c6 varchar(8));
INSERT INTO tenk1 values(generate_series(1,100000000),
generate_series(1,100000000),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)));
explain analyze verbose create table test as select * from tenk1;

DROP TABLE tenk1;
CREATE UNLOGGED TABLE tenk1(c1 bigint, c2 bigint, c3 name, c4 name, c5
name, c6 name, c7 name, c8 name, c9 name, c10 name, c11 name, c12
name, c13 name, c14 name, c15 name, c16 name, c17 name, c18 name);
INSERT INTO tenk1 values(generate_series(1,10000000),
generate_series(1,10000000),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)),
upper(substring(md5(random()::varchar),2,8)));
explain analyze verbose create unlogged table test as select * from tenk1;

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-03-19 07:34:57 Re: fdatasync performance problem with large number of DB files
Previous Message Fujii Masao 2021-03-19 07:30:04 Re: About to add WAL write/fsync statistics to pg_stat_wal view