Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-10-16 10:26:51
Message-ID: CAA4eK1+VG8ZXuTSe8w=TTFioZK-538wLMaKh8P3Vn-QGSXppqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 16, 2020 at 2:16 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Fri, Oct 16, 2020 at 3:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
>
> > Also, I noticed that you have allowed for
> > parallelism only when all expressions/functions involved with Insert
> > are parallel-safe, can't we allow parallel-restricted case because
> > anyway Inserts have to be performed by the leader for this patch.
> >
>
> Yes, I think you're right.
> "A parallel restricted operation is one which cannot be performed in a
> parallel worker, but which can be performed in the leader while
> parallel query is in use."
> I'll make the change and test that everything works OK.
>

Cool, let me try to explain my thoughts a bit more. The idea is first
(in standard_planner) we check if there is any 'parallel_unsafe'
function/expression (via max_parallel_hazard) in the query tree. If we
don't find anything 'parallel_unsafe' then we mark parallelModeOk. At
this stage, the patch is checking whether there is any
'parallel_unsafe' or 'parallel_restricted' expression/function in the
target relation and if there is none then we mark parallelModeOK as
true. So, if there is anything 'parallel_restricted' then we will mark
parallelModeOK as false which doesn't seem right to me.

Then later in the planner during set_rel_consider_parallel, we
determine if a particular relation can be scanned from within a
worker, then we consider that relation for parallelism. Here, we
determine if certain things are parallel-restricted then we don't
consider this for parallelism. Then we create partial paths for the
relations that are considered for parallelism. I think we don't need
to change anything for the current patch in these later stages because
we anyway are not considering Insert to be pushed into workers.
However, in the second patch where we are thinking to push Inserts in
workers, we might need to do something to filter parallel-restricted
cases during this stage of the planner.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2020-10-16 10:30:34 Re: Implementing Incremental View Maintenance
Previous Message movead.li@highgo.ca 2020-10-16 09:39:58 Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.