Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2021-05-26 11:34:45
Message-ID: CALj2ACX1Kht4wVJc0pacwafEkd6gCcx+Op5a5XoaEWYR_vw_jA@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 25, 2021 at 1:10 PM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
> Although this should be a controversial and may be crazy idea, the following change brought 4-11% speedup. This is because I thought parallel workers might contend for WAL flush as a result of them using the limited ring buffer and flushing dirty buffers when the ring buffer is filled. Can we take advantage of this?
>
> [GetBulkInsertState]
> /* bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);*/
> bistate->strategy = NULL;

You are right. If the ring buffer (16MB) is not used and shared
buffers (1GB) are used instead, then in your case, since the table is
335MB and fits in shared buffers, there will be little or no dirty
buffer flushing, so there will be some additional speedup.
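
To put rough numbers on the sizes above, here is a small back-of-the-envelope
sketch. It assumes the default 8kB block size; the helper names pages_in_mb
and ring_refills are made up for illustration and are not PostgreSQL
functions. It only counts how often a ring of a given size would fill up
while writing the table; it does not model WAL flushing or contention.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default PostgreSQL block size (BLCKSZ) of 8kB. */
#define BLOCK_SIZE_BYTES 8192

/* Number of whole pages that fit in size_mb megabytes. */
static int64_t
pages_in_mb(int64_t size_mb)
{
    assert(size_mb >= 0);
    return size_mb * 1024 * 1024 / BLOCK_SIZE_BYTES;
}

/*
 * How many times a ring of ring_mb megabytes fills up while writing a
 * table of table_mb megabytes (rounded up). Each fill is a point where
 * dirty buffers in the ring have to be written out before reuse.
 */
static int64_t
ring_refills(int64_t table_mb, int64_t ring_mb)
{
    int64_t table_pages = pages_in_mb(table_mb);
    int64_t ring_pages = pages_in_mb(ring_mb);

    assert(ring_pages > 0);
    return (table_pages + ring_pages - 1) / ring_pages;
}
```

With these assumptions, a 16MB ring holds 2048 pages, the 335MB table is
42880 pages, and 1GB of shared buffers holds 131072 pages, so the ring
fills about 21 times whereas the whole table fits in shared buffers.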

Alternatively, a similar speedup can be observed when the
BAS_BULKWRITE ring size is increased a bit from the current 16MB to
some other reasonable value. I tried these experiments earlier.

As another option, as I said in [1], we can also increase the number
of extra blocks added at a time, say to Min(1024, lockWaiters * N)
with N being 128, 256 or 512, instead of the current
extraBlocks = Min(512, lockWaiters * 20);. This also gives some
speedup, and we don't see any regression with the parallel inserts in
CTAS patches.
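
For comparison, the two extension rules can be written side by side as
below. This is only a standalone sketch of the arithmetic, not the actual
relation extension code; min_int, extra_blocks_current and
extra_blocks_proposed are illustrative names, and the multiplier argument
stands for whichever of 128, 256 or 512 is being tried.

```c
#include <assert.h>

/* Local stand-in for PostgreSQL's Min() macro. */
static int
min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Current rule as quoted in this message: Min(512, lockWaiters * 20). */
static int
extra_blocks_current(int lock_waiters)
{
    assert(lock_waiters >= 0);
    return min_int(512, lock_waiters * 20);
}

/*
 * Rule suggested in this message: Min(1024, lockWaiters * multiplier),
 * where multiplier is one of the values tried (128, 256 or 512).
 */
static int
extra_blocks_proposed(int lock_waiters, int multiplier)
{
    assert(lock_waiters >= 0 && multiplier > 0);
    return min_int(1024, lock_waiters * multiplier);
}
```

For example, with 4 lock waiters the current rule adds 80 blocks, while the
suggested rule with a multiplier of 128 adds 512; with 8 or more waiters it
reaches the 1024 cap.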

But I'm not sure that the hackers will agree to any of the above as a
practical solution to the "relation extension" problem.

[1] https://www.postgresql.org/message-id/CALj2ACVdcrjwHXwvJqT-Fa32vnJEOjteep_3L24X8MK50E7M8w%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
