Re: Determine parallel-safety of partition relations for Inserts

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Determine parallel-safety of partition relations for Inserts
Date: 2021-01-17 11:15:29
Message-ID: CA+HiwqEmzHgyfn6XKMsp9RLFeOfOq-t-e9Or7vs1XnZFPGq31w@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 16, 2021 at 2:02 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Jan 15, 2021 at 6:45 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Fri, Jan 15, 2021 at 9:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > We want to do this for Inserts where only Select can be parallel and
> > > Inserts will always be done by the leader backend. This is actually
> > > the case we first want to implement.
> >
> > Sorry, I haven't looked at the linked threads and the latest patches
> > there closely enough yet, so I may be misreading this, but if the
> > inserts will always be done by the leader backend as you say, then why
> > does the planner need to be checking the parallel safety of the
> > *target* table's expressions?
> >
>
> The reason is that once we enter parallel-mode we can't allow
> parallel-unsafe things (like allocation of new CIDs, XIDs, etc.). We
> enter the parallel-mode at the beginning of the statement execution,
> see ExecutePlan(). So, the Insert will be performed in parallel-mode
> even though it happens in the leader backend. It is not possible that
> we finish getting all the tuples from the gather node first and then
> start inserting. Even if we somehow found a way to make this work,
> the checks being discussed would still be required to make inserts
> parallel (where inserts will be performed by workers), which is
> actually the next patch in the thread I mentioned in the previous
> email.
>
> Does this answer your question?

Yes, thanks for the explanation. I kind of figured that doing the
insert part itself in parallel using workers would be a part of the
end goal of this work, although that didn't come across immediately.

It's a bit unfortunate that the parallel safety check of the
individual partitions cannot be deferred until it's known that a given
partition will be affected by the command at all. Will we need
fundamental changes to how parallel query works to make that possible?
If so, have such options been considered in these projects? If such
changes are not possible in the short term, like for v14, we should at
least try to make sure that the eager checking of all partitions is
only performed if using parallelism is possible at all.
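
To illustrate the ordering suggested above, here is a minimal,
self-contained sketch in C. It is not taken from the patches or from
PostgreSQL itself: every type and function name in it (SketchPartition,
sketch_insert_can_use_parallel_mode, and so on) is hypothetical, and
the two boolean inputs simply stand in for whatever cheap conditions
would rule out parallelism before any partition is looked at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a partition; not a PostgreSQL structure. */
typedef struct SketchPartition
{
    /* true if some expression attached to the partition is parallel-unsafe */
    bool has_unsafe_expr;
} SketchPartition;

/* Counts inspected partitions, only to make the cost of the walk visible. */
static int sketch_partitions_checked = 0;

/*
 * The expensive part: inspect partitions one by one and stop at the
 * first one that carries a parallel-unsafe expression.
 */
static bool
sketch_all_partitions_parallel_safe(const SketchPartition *parts, size_t nparts)
{
    assert(parts != NULL || nparts == 0);

    for (size_t i = 0; i < nparts; i++)
    {
        sketch_partitions_checked++;
        if (parts[i].has_unsafe_expr)
            return false;
    }
    return true;
}

/*
 * Evaluate the cheap preconditions first, so that the partition walk
 * happens only when parallelism is otherwise possible.
 */
static bool
sketch_insert_can_use_parallel_mode(bool parallelism_enabled,
                                    bool source_query_parallel_safe,
                                    const SketchPartition *parts,
                                    size_t nparts)
{
    if (!parallelism_enabled || !source_query_parallel_safe)
        return false;       /* no partition is inspected in this case */

    return sketch_all_partitions_parallel_safe(parts, nparts);
}
```

With this ordering, a statement that cannot use parallelism for other
reasons pays nothing for the partition checks; the full walk is paid
only when a parallel plan is still a possibility.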

I will try to take a look at the patches themselves to see if there's
something I know that will help.

--
Amit Langote
EDB: http://www.enterprisedb.com
