Re: Parallel INSERT (INTO ... SELECT ...)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-10 14:00:59
Message-ID: CAFiTN-uGNpBVv7D9_o+V40H9_Q6p7Br6GjeqzXXQA7xh_O+8HQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 10, 2021 at 2:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Mar 8, 2021 at 7:19 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> >
> > Hi Amit
> >
> > On Mon, Mar 8, 2021 at 10:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > On Mon, Mar 8, 2021 at 3:54 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > > > I've attached an updated set of patches with the suggested locking changes.
> >
> > (Thanks Greg.)
> >
> > > Amit L, others, do let me know if you have still more comments on
> > > 0001* patch or if you want to review it further?
> >
> > I just read through v25 and didn't find anything to complain about.
> >
>
> Thanks a lot, pushed now! Amit L., your inputs are valuable for this work.
>
> Now, coming back to Hou-San's patch to introduce a GUC and reloption
> for this feature, I think both of those make sense to me because when
> the feature is enabled via GUC, one might want to disable it for
> partitioned tables? Do we agree on that part or someone thinks
> otherwise?

I think it makes sense to have both.

>
> The other points to bikeshed could be:
> 1. The name of GUC and reloption. The two proposals at hand are
> enable_parallel_dml and enable_parallel_insert. I would prefer the
> second (enable_parallel_insert) because updates/deletes might not have
> a similar overhead.

I agree enable_parallel_insert makes more sense.

> 2. Should we keep the default value of GUC to on or off? It is
> currently off. I am fine keeping it off for this release and we can
> always turn it on in the later releases if required. Having said that,
> I see the value in keeping it on because in many cases Insert ...
> Select will be used for large data and there we will see a benefit of
> parallelism and users facing trouble (who have a very large number of
> partitions with less data to query) can still disable the parallel
> insert for that particular table. Also, the other benefit of keeping
> it on till at least the beta period is that this functionality will
> get tested and if we found reports of regression then we can turn it
> off for this release as well.
>
> Thoughts?

IMHO, we should keep it on because in most of the cases it is going
the give benefit to the user, and if for some specific scenario where
a table has a lot of partition then it can be turned off using
reloption. And, if for some users most of the tables are like that
where they always have a lot of partition then the user can turn it
off using guc.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2021-03-10 14:01:05 Re: Lowering the ever-growing heap->pd_lower
Previous Message David Steele 2021-03-10 13:52:50 Re: proposal: unescape_text function