Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-05 13:04:36
Message-ID: CAJcOf-deq_XM1xjB3_nzQfowdEv2x+Q_skh=ufSYC-K_ME-tHw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 5, 2021 at 9:35 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Mar 5, 2021 at 8:24 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >
>
> In patch v21-0003-Add-new-parallel-dml-GUC-and-table-options, we are
> introducing a GUC (enable_parallel_dml) and a table option
> (parallel_dml_enabled) for this feature. I am a bit worried about
> using *_dml in the names because it is quite possible that for
> parallel updates and parallel deletes we might not need any such GUC.
> The main reason we need it here is the cost of checking the
> parallel-safety of partitioned tables; updates/deletes handle
> partitioned tables differently than inserts, so those checks might not
> be that costly. It is possible that they are costly for a different
> reason, but I am not sure mapping those to one GUC or table option is
> a good idea. Can we consider using *_insert instead? I think a GUC
> named with _insert could probably be used for a parallel copy (from)
> as well, which I think will have a similar overhead.
>

I'll need to think about that one.
I may be wrong, but I would have thought at least updates would have
similar parallel-safety checking requirements to inserts and would
have similar potential cost issues.

For the time being at least, I am posting an updated set of patches,
as I found the additional parallel-safety checks on DOMAIN check
constraints to be somewhat inefficient, and they could also be better
integrated into max_parallel_hazard(). I also updated the basic tests
with a test case for this.

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment Content-Type Size
v22-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch application/octet-stream 33.4 KB
v22-0002-Parallel-SELECT-for-INSERT-INTO-.-SELECT-basic-tests-and-doc.patch application/octet-stream 32.3 KB
v22-0003-Add-new-parallel-dml-GUC-and-table-options.patch application/octet-stream 19.5 KB
v22-0004-Parallel-SELECT-for-INSERT-INTO-.-SELECT-advanced-tests.patch application/octet-stream 45.3 KB
v22-0005-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch application/octet-stream 44.1 KB
v22-0006-Parallel-INSERT-and-or-SELECT-for-INSERT-INTO-tests-and-doc.patch application/octet-stream 22.4 KB
