Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-10 09:18:19
Message-ID: CAA4eK1+WBN99GOc8MwHfFoa7c-ssy+gYcbGVL3b8+A04pQKwWA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 8, 2021 at 7:19 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> Hi Amit
>
> On Mon, Mar 8, 2021 at 10:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Mon, Mar 8, 2021 at 3:54 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > > I've attached an updated set of patches with the suggested locking changes.
>
> (Thanks Greg.)
>
> > Amit L, others, do let me know if you have still more comments on
> > 0001* patch or if you want to review it further?
>
> I just read through v25 and didn't find anything to complain about.
>

Thanks a lot, pushed now! Amit L., your inputs are valuable for this work.

Now, coming back to Hou-San's patch to introduce a GUC and a reloption
for this feature: I think both make sense, because when the feature is
enabled via the GUC, one might still want to disable it for
partitioned tables. Do we agree on that part, or does anyone think
otherwise?
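
To make the intended interaction concrete, here is a minimal sketch in
Python (not PostgreSQL code). It assumes the GUC acts as a master
switch and the reloption can only opt an individual table out; that
precedence is an assumption, and the function and parameter names are
hypothetical and do not come from the patch.

```python
from typing import Optional


def parallel_insert_allowed(guc_enabled: bool,
                            table_reloption: Optional[bool]) -> bool:
    """Decide whether parallel INSERT ... SELECT may be considered.

    guc_enabled     -- hypothetical stand-in for the session-level GUC.
    table_reloption -- hypothetical per-table setting; None means the
                       reloption was never set on the table.

    Assumption: the GUC is a master switch, so a table-level "on"
    cannot enable the feature while the GUC is off.
    """
    if not guc_enabled:
        # Feature is off globally; the table setting is irrelevant.
        return False
    if table_reloption is None:
        # No per-table override, so follow the GUC.
        return True
    # The table either explicitly opts out (False) or stays opted in (True).
    return table_reloption


# GUC on, but a heavily partitioned table has opted out.
print(parallel_insert_allowed(True, False))
# GUC on, ordinary table with no reloption set.
print(parallel_insert_allowed(True, None))
```

Under these assumptions, a user with a very large number of partitions
could leave the GUC on and opt out only the affected table.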

The other points to bikeshed could be:
1. The name of the GUC and the reloption. The two proposals at hand
are enable_parallel_dml and enable_parallel_insert. I would prefer the
second (enable_parallel_insert) because updates/deletes might not have
a similar overhead.

2. Should the GUC default to on or off? It is currently off. I am fine
with keeping it off for this release; we can always turn it on in a
later release if required. Having said that, I see value in keeping it
on: in many cases INSERT ... SELECT will be used for large data, where
we will see a benefit from parallelism, and users who run into trouble
(those with a very large number of partitions and little data to
query) can still disable parallel insert for that particular table.
The other benefit of keeping it on at least through the beta period is
that this functionality will get tested, and if we get reports of
regressions we can still turn it off for this release.

Thoughts?

--
With Regards,
Amit Kapila.
