Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-12 13:17:22
Message-ID: CA+HiwqHtR8v0K=2LSvbmtQaB4v8SsMvTV2qD5WsqkfW+aXHQbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> On Thu, Feb 11, 2021 at 5:33 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > On Tue, Feb 9, 2021 at 1:04 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > >
> > > * I think that the concerns raised by Tsunakawa-san in:
> > >
> > > https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
> > >
> > > regarding how this interacts with plancache.c deserve a look.
> > > Specifically, a plan that uses parallel insert may fail to be
> > > invalidated when partitions are altered directly (that is without
> > > altering their root parent). That would be because we are not adding
> > > partition OIDs to PlannerGlobal.invalItems despite making a plan
> > > that's based on checking their properties. See this (tested with all
> > > patches applied!):
> > >
> >
> > Does any current Postgres code add partition OIDs to
> > PlannerGlobal.invalItems for a similar reason?

Currently, the planner opens partitions only for SELECT queries and
also adds them to the query's range table. And because they are added
to the range table, their OIDs do get added to
PlannerGlobal.relationOids (not invalItems, sorry!) by way of
CompleteCachedPlan() calling extract_query_dependencies(), which looks
at Query.rtable to decide which tables/partitions to add.

> > I would have thought that, for example, partitions with a default
> > column expression, using a function that is changed from SAFE to
> > UNSAFE, would suffer the same plancache issue (for current parallel
> > SELECT functionality) as we're talking about here - but so far I
> > haven't seen any code handling this.

AFAIK, default column expressions don't affect plans for SELECT
queries. OTOH, consider a check constraint expression as an example.
The planner may use one to exclude a partition from the plan with its
constraint exclusion algorithm (separate from "partition pruning").
If the check constraint is dropped, any cached plans that used it will
be invalidated.

create table rp (a int) partition by range (a);
create table rp1 partition of rp for values from (minvalue) to (0);
create table rp2 partition of rp for values from (0) to (maxvalue);
alter table rp1 add constraint chk check (a >= -5);
set constraint_exclusion to on;

-- forces using a cached plan
set plan_cache_mode to force_generic_plan ;
prepare q as select * from rp where a < -5;

-- planner excluded rp1 because of the contradictory constraint
explain execute q;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)

-- constraint dropped, plancache inval hook invoked
alter table rp1 drop constraint chk ;

-- old plan invalidated, new one made
explain execute q;
QUERY PLAN
---------------------------------------------------------
Seq Scan on rp1 rp (cost=0.00..41.88 rows=850 width=4)
Filter: (a < '-5'::integer)
(2 rows)

> > (Currently invalItems seems to support PROCID and TYPEOID; relation
> > OIDs seem to be handled through a different mechanism)..
>
> > Can you elaborate on what you believe is required here, so that the
> > partition OID dependency is registered and the altered partition
> > results in the plan being invalidated?
> > Thanks in advance for any help you can provide here.
>
> Actually, I tried adding the following in the loop that checks the
> parallel-safety of each partition and it seemed to work:
>
> glob->relationOids =
> lappend_oid(glob->relationOids, pdesc->oids[i]);
>
> Can you confirm, is that what you were referring to?

Right. I had mistakenly mentioned PlannerGlobal.invalItems, sorry.

Although it gets the job done, I'm not sure if manipulating
relationOids from max_parallel_hazard() or its subroutines is okay,
but I will let the committer decide that. As I mentioned above, the
person who designed this decided for some reason that it is
extract_query_dependencies()'s job to populate
PlannerGlobal.relationOids/invalItems.

> (note that I've already updated the code to use
> CreatePartitionDirectory() and PartitionDirectoryLookup())

I will check your v16 to check if that indeed does the intended thing.

--
Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-02-12 13:43:38 Why do we have MakeSingleTupleTableSlot instead of not using MakeTupleTableSlot?
Previous Message Philippe Beaudoin 2021-02-12 13:15:42 Re: PostgreSQL and Flashback Database