Re: Parallel INSERT (INTO ... SELECT ...)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-09-25 13:09:32
Message-ID: CALj2ACWcy3gud18R=-dcpxzBSGv_j-kfusxECsSP239ZGgHLsA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 25, 2020 at 5:47 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> >
> > At least in the case of Parallel INSERT, the leader for the Parallel
> > INSERT gets a new xid (GetCurrentFullTransactionId) and it is passed
> > through and assigned to each of the workers during their
> > initialization (so they are assigned the same xid).
> >
>
> So are you facing problems in this area because we EnterParallelMode
> before even assigning the xid in the leader? Because I don't think we
> should ever reach this code in the worker. If so, there are two
> possibilities that come to my mind (a) assign xid in leader before
> entering parallel mode or (b) change the check so that we don't assign
> the new xid in workers. In this case, I am again wondering how
> parallel copy is dealing with this?
>

In parallel copy, we do option (a), i.e. the leader gets the full
transaction id before entering parallel mode and passes it to all the
workers.

In the leader:
    full_transaction_id = GetCurrentFullTransactionId();
    EnterParallelMode();
    shared_info_ptr->full_transaction_id = full_transaction_id;

In the workers:
    AssignFullTransactionIdForWorker(pcshared_info->full_transaction_id);

Hence the following check is never hit:
    if (IsInParallelMode() || IsParallelWorker())
        elog(ERROR, "cannot assign XIDs during a parallel operation");

We deal with the command id similarly, i.e. the leader gets the
command id and the workers use it during insertion.

In the leader:
    shared_info_ptr->mycid = GetCurrentCommandId(true);

In the workers:
    AssignCommandIdForWorker(pcshared_info->mycid, true);
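
To make the ordering concrete, here is a minimal standalone sketch of the
handoff described above. It is not the patch code: Backend, SharedInfo,
FullXid, next_xid and all function names below are hypothetical stand-ins,
and the real elog(ERROR) is modelled as a false return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t FullXid;    /* hypothetical stand-in for FullTransactionId; 0 = unassigned */
typedef uint32_t CommandId;

/* Hypothetical shared-memory area filled by the leader, read by workers. */
typedef struct SharedInfo
{
    FullXid   full_transaction_id;
    CommandId mycid;
} SharedInfo;

/* Hypothetical per-process transaction state. */
typedef struct Backend
{
    bool      in_parallel_mode;
    bool      is_worker;
    FullXid   xid;
    CommandId cid;
    bool      cid_used;
} Backend;

FullXid next_xid = 100;      /* toy xid counter */

/* Return the backend's xid, assigning one if needed. Returns false where
 * the real code would raise "cannot assign XIDs during a parallel operation". */
bool
backend_get_xid(Backend *b, FullXid *out)
{
    if (b->xid == 0)
    {
        if (b->in_parallel_mode || b->is_worker)
            return false;
        b->xid = next_xid++;
    }
    *out = b->xid;
    return true;
}

/* Leader side: get the xid first, then enter parallel mode, then publish. */
bool
leader_prepare(Backend *leader, SharedInfo *shared)
{
    FullXid xid;

    if (!backend_get_xid(leader, &xid))
        return false;
    leader->in_parallel_mode = true;
    shared->full_transaction_id = xid;
    shared->mycid = leader->cid;
    leader->cid_used = true;
    return true;
}

/* Worker side: adopt the leader's xid and command id from shared memory. */
void
worker_attach(Backend *worker, const SharedInfo *shared)
{
    worker->is_worker = true;
    assert(worker->in_parallel_mode || worker->is_worker);
    worker->xid = shared->full_transaction_id;
    worker->cid = shared->mycid;
    worker->cid_used = true;
}
```

In this sketch a worker that later asks for its xid just gets the one it was
handed, so the error path is never taken. A backend that is already in
parallel mode without an xid does take it, which is the situation option (a)
avoids.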

[1] The two worker-side functions used above:
void
AssignFullTransactionIdForWorker(FullTransactionId fullTransactionId)
{
    TransactionState s = CurrentTransactionState;

    Assert((IsInParallelMode() || IsParallelWorker()));
    s->fullTransactionId = fullTransactionId;
}

void
AssignCommandIdForWorker(CommandId commandId, bool used)
{
    Assert((IsInParallelMode() || IsParallelWorker()));

    /* this is global to a transaction, not subtransaction-local */
    if (used)
        currentCommandIdUsed = true;

    currentCommandId = commandId;
}

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
