Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-09-26 05:16:51
Message-ID: CAA4eK1+E-pM0U6qw7EOF0yO0giTxdErxoJV9xTqN+Lo9zdotFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 25, 2020 at 9:23 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Fri, Sep 25, 2020 at 10:17 PM Amit Kapila <amit(dot)kapila16(at)g
>
> As it turns out, I think I have solved the commandId issue (and almost
> the xid issue) by realising that both the xid and cid are ALREADY
> being included as part of the serialized transaction state in the
> Parallel DSM. So actually I don't believe that there is any need for
> separately passing them in the DSM, and having to use those
> AssignXXXXForWorker() functions in the worker code - not even in the
> Parallel Copy case (? - need to check). GetCurrentCommandId(true) and
> GetFullTransactionId() need to be called prior to Parallel DSM
> initialization, so they are included in the serialized transaction
> state.
> I just needed to add a function to set currentCommandIdUsed=true in
> the worker initialization (for INSERT case) and make a small tweak to
> the Assert in GetCurrentCommandId() to ensure that
> currentCommandIdUsed, in a parallel worker, never gets set to true
> when it is false. This is in line with the comment in that function,
> because we know that "currentCommandId was already true at the start
> of the parallel operation". With this in place, I don't need to change
> any of the original calls to GetCurrentCommandId(), so this addresses
> that issue raised by Andres.
>
> I am not sure yet how to get past the issue of the parallel mode being
> set at the top of ExecutePlan(). With that in place, it doesn't allow
> a xid to be acquired for the leader, without removing/changing that
> parallel-mode check in GetNewTransactionId().
>

I think now there is no fundamental problem in allocating xid in the
leader and then sharing it with workers who can use it to perform the
insert. So we can probably tweak that check so that it is true for
only parallel workers.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-09-26 05:29:44 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Tom Lane 2020-09-26 05:03:37 Re: a potential size overflow issue