Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-09-26 05:41:52
Message-ID: CAA4eK1LgUnk9X5yvYnwwoueijB-uuGPECVEAHVPmxKLoHW+xqQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 26, 2020 at 11:00 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >
> > On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> >
> > Again, there's a fundamental difference in the Parallel Insert case.
> > Right at the top of ExecutePlan it calls EnterParallelMode().
> > For ParallelCopy(), there is no such problem. EnterParallelMode() is
> > only called just before ParallelCopyMain() is called. So it can easily
> > acquire the xid before this, because then parallel mode is not set.
> >
> > As it turns out, I think I have solved the commandId issue (and almost
> > the xid issue) by realising that both the xid and cid are ALREADY
> > being included as part of the serialized transaction state in the
> > Parallel DSM. So actually I don't believe that there is any need for
> > separately passing them in the DSM, and having to use those
> > AssignXXXXForWorker() functions in the worker code - not even in the
> > Parallel Copy case (? - need to check).
> >
>
> Thanks Greg for the detailed points.
>
> I further checked on full txn id and command id. Yes, these are
> getting passed to workers via InitializeParallelDSM() ->
> SerializeTransactionState(). I tried to summarize what we need to do
> in case of parallel inserts in general i.e. parallel COPY, parallel
> inserts in INSERT INTO and parallel inserts in CTAS.
>
> In the leader:
> GetCurrentFullTransactionId()
> GetCurrentCommandId(true)
> EnterParallelMode();
> InitializeParallelDSM() --> calls SerializeTransactionState()
> (both full txn id and command id are serialized into parallel DSM)
>

This won't hold for the Parallel Insert patch, as Greg also explained,
because there we enter parallel mode well before we assign the xid.
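
To make the ordering problem concrete, here is a minimal toy model in C.
It is not PostgreSQL source: every toy_* name is hypothetical, and the
only behaviour it assumes is the one discussed in this thread, namely
that an xid cannot be newly assigned once parallel mode has been entered,
and that the serialized transaction state carries whatever xid exists at
serialization time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef struct ToyXactState
{
    int      parallel_mode_level;   /* > 0 while "in parallel mode" */
    uint32_t xid;                   /* 0 means no xid assigned yet */
    uint32_t next_xid;              /* next value to hand out */
} ToyXactState;

/* Stand-in for the transaction state copied into the parallel DSM. */
typedef struct ToySerializedState
{
    uint32_t xid;
} ToySerializedState;

static void
toy_init(ToyXactState *s)
{
    s->parallel_mode_level = 0;
    s->xid = 0;
    s->next_xid = 100;
}

static void
toy_enter_parallel_mode(ToyXactState *s)
{
    s->parallel_mode_level++;
}

static void
toy_exit_parallel_mode(ToyXactState *s)
{
    assert(s->parallel_mode_level > 0);
    s->parallel_mode_level--;
}

/* Returns false when a new xid is requested while in parallel mode. */
static bool
toy_assign_xid(ToyXactState *s)
{
    if (s->xid != 0)
        return true;            /* already assigned, nothing to do */
    if (s->parallel_mode_level > 0)
        return false;           /* assumed restriction: refuse */
    s->xid = s->next_xid++;
    return true;
}

static ToySerializedState
toy_serialize(const ToyXactState *s)
{
    ToySerializedState out = { s->xid };
    return out;
}

/* Ordering from the quoted summary: xid first, then parallel mode. */
uint32_t
toy_xid_first_ordering(void)
{
    ToyXactState s;
    ToySerializedState dsm;

    toy_init(&s);
    (void) toy_assign_xid(&s);
    toy_enter_parallel_mode(&s);
    dsm = toy_serialize(&s);
    toy_exit_parallel_mode(&s);
    return dsm.xid;             /* workers would see a valid xid */
}

/* Parallel Insert ordering: parallel mode is entered before the xid. */
uint32_t
toy_parallel_mode_first_ordering(void)
{
    ToyXactState s;
    ToySerializedState dsm;

    toy_init(&s);
    toy_enter_parallel_mode(&s);
    (void) toy_assign_xid(&s);  /* refused in this model */
    dsm = toy_serialize(&s);
    toy_exit_parallel_mode(&s);
    return dsm.xid;             /* workers would see no xid (0) */
}
```

In this model the first ordering serializes a valid xid for the workers,
while the second serializes 0, which is why the leader-side sequence
quoted above cannot be applied to the Parallel Insert patch as is.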

--
With Regards,
Amit Kapila.
