Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-18 05:35:06
Message-ID: CA+HiwqEED-_FEKzXxC26qY19mMG6_+hmywxZYY9_AGNjbwbjmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 18, 2021 at 10:03 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> On Thu, Feb 18, 2021 at 12:34 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > All that is to say that we should move our code to add partition OIDs
> > as plan dependencies to somewhere in set_plan_references(), which
> > otherwise populates PlannedStmt.relationOids. I updated the patch to
> > do that.
>
> OK, understood. Thanks for the detailed explanation.
>
> > It also occurred to me that we can avoid pointless adding of
> > partitions if the final plan won't use parallelism. For that, the
> > patch adds checking glob->parallelModeNeeded, which seems to do the
> > trick though I don't know if that's the correct way of doing that.
> >
>
> I'm not sure if's pointless adding partitions even in the case of NOT
> using parallelism, because we may be relying on the result of
> parallel-safety checks on partitions in both cases.
> For example, insert_parallel.sql currently includes a test (that you
> originally provided in a previous post) that checks a non-parallel
> plan is generated after a parallel-unsafe trigger is created on a
> partition involved in the INSERT.
> If I further add to that test by then dropping that trigger and then
> again using EXPLAIN to see what plan is generated, then I'd expect a
> parallel-plan to be generated, but with the setrefs-v3.patch it still
> generates a non-parallel plan. So I think the "&&
> glob->parallelModeNeeded" part of test needs to be removed.

Ah, okay, I didn't retest my case after making that change.

Looking at this again, I am a bit concerned about going over the whole
partition tree *twice* when making a parallel plan for insert into
partitioned tables. Maybe we should do what you did in your first
attempt a slightly differently -- add partition OIDs during the
max_parallel_hazard() initiated scan of the partition tree as you did.
Instead of adding them directly to PlannerGlobal.relationOids, add to,
say, PlannerInfo.targetPartitionOids and have set_plan_references() do
list_concat(glob->relationOids, list_copy(root->targetPartitionOids)
in the same place as setrefs-v3 does
add_target_partition_oids_recurse(). Thoughts?

--
Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-02-18 06:02:22 Re: A qsort template
Previous Message Michael Paquier 2021-02-18 05:17:00 Re: progress reporting for partitioned REINDEX