Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-12-09 10:34:47
Message-ID: CAA4eK1JqPNgnud75NeyfPoO1nug-K2B-UBRH=_K0w_75KqzZPQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 9, 2020 at 2:38 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Dec 9, 2020 at 10:11 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 9, 2020 at 1:35 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > Most of the code present in
> > > v9-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch is
> > > applicable for parallel copy patch also. The patch in this thread
> > > handles the check for PROPARALLEL_UNSAFE, we could slightly make it
> > > generic by handling like the comments below, that way this parallel
> > > safety checks can be used based on the value set in
> > > max_parallel_hazard_context. There is nothing wrong with the changes,
> > > I'm providing these comments so that this patch can be generalized for
> > > parallel checks and the same can also be used by parallel copy.
> >
> > Hi Vignesh,
> >
> > You are absolutely right in pointing that out, the code was taking
> > short-cuts knowing that for Parallel Insert,
> > "max_parallel_hazard_context.max_interesting" had been set to
> > PROPARALLEL_UNSAFE, which doesn't allow that code to be generically
> > re-used by other callers.
> >
> > I've attached a new set of patches that includes your suggested improvements.
>
> I was going through v10-0001 patch where we are parallelizing only the
> select part.
>
> + /*
> +  * UPDATE is not currently supported in parallel-mode, so prohibit
> +  * INSERT...ON CONFLICT...DO UPDATE...
> +  */
> + if (parse->onConflict != NULL &&
> +     parse->onConflict->action == ONCONFLICT_UPDATE)
> +     return PROPARALLEL_UNSAFE;
>
> I understand that we can now allow updates from the worker, but what
> is the problem if we allow the parallel select even if there is an
> update in the leader?
>

I think we can't allow updates even in the leader without a mechanism
for a shared combocid table. Right now, we share the ComboCids with the
workers at the beginning of the parallel query and never change them
while the query runs. If we allow updates in the leader backend, which
can generate a new combocid, then we need a mechanism to propagate that
change to the workers. Does this make sense?
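
To illustrate the concern, here is a toy model in plain C, not the
actual PostgreSQL code. It assumes a combocid is just an index into a
per-backend array of (cmin, cmax) pairs and that a worker receives a
one-time copy of that array when the parallel query starts. All names
(toy_get_combo_cid, toy_copy_table, toy_resolve, MAX_COMBOCIDS) are
made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMBOCIDS 16        /* arbitrary limit for the toy model */

typedef unsigned int CommandId;

/* One entry maps a combocid (the array index) to a (cmin, cmax) pair. */
typedef struct
{
    CommandId   cmin;
    CommandId   cmax;
} ComboCidEntry;

/* Backend-local table; hypothetical stand-in for the real structure. */
typedef struct
{
    ComboCidEntry entries[MAX_COMBOCIDS];
    size_t      used;
} ComboCidTable;

/* Return the combocid for (cmin, cmax), creating a new entry if needed. */
static CommandId
toy_get_combo_cid(ComboCidTable *t, CommandId cmin, CommandId cmax)
{
    for (size_t i = 0; i < t->used; i++)
    {
        if (t->entries[i].cmin == cmin && t->entries[i].cmax == cmax)
            return (CommandId) i;
    }

    assert(t->used < MAX_COMBOCIDS);
    t->entries[t->used].cmin = cmin;
    t->entries[t->used].cmax = cmax;
    return (CommandId) t->used++;
}

/* Worker startup: take a one-time copy of the leader's table. */
static void
toy_copy_table(ComboCidTable *dst, const ComboCidTable *src)
{
    *dst = *src;
}

/* Resolve a combocid; returns false if this backend has never seen it. */
static bool
toy_resolve(const ComboCidTable *t, CommandId combocid,
            CommandId *cmin, CommandId *cmax)
{
    if (combocid >= t->used)
        return false;

    *cmin = t->entries[combocid].cmin;
    *cmax = t->entries[combocid].cmax;
    return true;
}
```

In this model, any combocid the leader creates after toy_copy_table()
has run is unknown to the worker, so the worker cannot map it back to
cmin/cmax for a visibility check. That is the gap a shared combocid
table would have to close.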

--
With Regards,
Amit Kapila.
