Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-17 01:44:15
Message-ID: CAJcOf-c2TrQFdhBgHMLs7HmzG1iLAUc3vnDmzYkmxek8Zjhfjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 17, 2021 at 12:19 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Mon, Feb 15, 2021 at 4:39 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > On Sat, Feb 13, 2021 at 12:17 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > > > Actually, I tried adding the following in the loop that checks the
> > > > parallel-safety of each partition and it seemed to work:
> > > >
> > > > glob->relationOids =
> > > > lappend_oid(glob->relationOids, pdesc->oids[i]);
> > > >
> > > > Can you confirm, is that what you were referring to?
> > >
> > > Right. I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
> > >
> > > Although it gets the job done, I'm not sure if manipulating
> > > relationOids from max_parallel_hazard() or its subroutines is okay,
> > > but I will let the committer decide that. As I mentioned above, the
> > > person who designed this decided for some reason that it is
> > > extract_query_dependencies()'s job to populate
> > > PlannerGlobal.relationOids/invalItems.
> >
> > Yes, it doesn't really seem right doing it within max_parallel_hazard().
> > I tried doing it in extract_query_dependencies() instead - see
> > attached patch - and it seems to work, but I'm not sure if there might
> > be any unintended side-effects.
>
> One issue I see with the patch is that it fails to consider
> multi-level partitioning, because it's looking up partitions only in
> the target table's PartitionDesc and no other.
>
> @@ -3060,8 +3066,36 @@ extract_query_dependencies_walker(Node *node,
> PlannerInfo *context)
> RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
>
> if (rte->rtekind == RTE_RELATION)
> - context->glob->relationOids =
> - lappend_oid(context->glob->relationOids, rte->relid);
> + {
> + PlannerGlobal *glob;
> +
> + glob = context->glob;
> + glob->relationOids =
> + lappend_oid(glob->relationOids, rte->relid);
> + if (query->commandType == CMD_INSERT &&
> + rte->relkind == RELKIND_PARTITIONED_TABLE)
>
> The RTE whose relkind is being checked here may not be the INSERT
> target relation's RTE, even though that's perhaps always true today.
> So, I suggest to pull the new block out of the loop over rtable and
> perform its deeds on the result RTE explicitly fetched using
> rt_fetch(), preferably using a separate recursive function. I'm
> thinking something like the attached revised version.
>
>

Thanks. Yes, I'd forgotten about the fact a partition may itself be
partitioned, so it needs to be recursive (like in the parallel-safety
checks).
Your revised version seems OK, though I do have a concern:
Is the use of "table_close(rel, NoLock)'' intentional? That will keep
the lock (lockmode) until end-of-transaction.

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-02-17 02:08:36 Re: pg_collation_actual_version() ERROR: cache lookup failed for collation 123
Previous Message Andy Fan 2021-02-17 01:07:50 Re: How to get Relation tuples in C function