Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-19 10:38:25
Message-ID: CAA4eK1JZFtZuVuXpNO+LONvyXCXPgPHKBxMu47T615yc0Yz-HA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 18, 2021 at 11:05 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> > > It also occurred to me that we can avoid pointless adding of
> > > partitions if the final plan won't use parallelism. For that, the
> > > patch adds checking glob->parallelModeNeeded, which seems to do the
> > > trick though I don't know if that's the correct way of doing that.
> > >
> >
> > I'm not sure if's pointless adding partitions even in the case of NOT
> > using parallelism, because we may be relying on the result of
> > parallel-safety checks on partitions in both cases.
> > For example, insert_parallel.sql currently includes a test (that you
> > originally provided in a previous post) that checks a non-parallel
> > plan is generated after a parallel-unsafe trigger is created on a
> > partition involved in the INSERT.
> > If I further add to that test by then dropping that trigger and then
> > again using EXPLAIN to see what plan is generated, then I'd expect a
> > parallel-plan to be generated, but with the setrefs-v3.patch it still
> > generates a non-parallel plan. So I think the "&&
> > glob->parallelModeNeeded" part of test needs to be removed.
>
> Ah, okay, I didn't retest my case after making that change.
>

Greg has point here but I feel something on previous lines (having a
test of glob->parallelModeNeeded) is better. We only want to
invalidate the plan if the prepared plan is unsafe to execute next
time. It is quite possible that there are unsafe triggers on different
partitions and only one of them is dropped, so next time planning will
again yield to the same non-parallel plan. If we agree with that I
think it is better to add this dependency in set_plan_refs (along with
Gather node handling).

Also, if we agree that we don't have any cheap way to determine
parallel-safety of partitioned relations then shall we consider the
patch being discussed [1] to be combined here?

Amit L, do you agree with that line of thought, or you have any other ideas?

I feel we should focus on getting the first patch of Greg
(v18-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT, along with
a test case patch) and Hou-San's patch discussed at [1] ready. Then we
can focus on the
v18-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO. Because
even if we get the first patch that is good enough for some users.

What do you think?

[1] - https://www.postgresql.org/message-id/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-02-19 11:08:44 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Andy Fan 2021-02-19 10:03:10 Re: Extend more usecase for planning time partition pruning and init partition pruning.