Re: Parallel INSERT (INTO ... SELECT ...)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2020-10-05 10:56:22
Message-ID: CALj2ACWU9vXO+h=H8kJzykChgZHGObrgvxwtEV+dxJ1skx8pJQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 30, 2020 at 7:38 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> > >
> > > I think you still need to work on the costing part. Basically, if
> > > we are parallelizing the whole insert, the plan looks like this:
> > >
> > > -> Gather
> > >    -> Parallel Insert
> > >       -> Parallel Seq Scan
> > >
> > > That means the tuples we select via the scan are not sent back to
> > > the Gather node. So in cost_gather we need to check whether the plan
> > > is for an INSERT; if it is, no rows are transferred through the
> > > parallel queue, so we need not pay any parallel tuple cost.
> >
> > I just looked into the parallel CTAS[1] patch for the same thing, and
> > I can see in that patch it is being handled.
> >
> > [1] https://www.postgresql.org/message-id/CALj2ACWFq6Z4_jd9RPByURB8-Y8wccQWzLf%2B0-Jg%2BKYT7ZO-Ug%40mail.gmail.com
> >
>
> Hi Dilip,
>
> You're right, the costing for Parallel Insert is not finished yet.
> I'm still working on it and haven't posted an updated patch.
> As far as the cost_gather() function is concerned, Parallel INSERT
> can probably use the same costing approach as the CTAS patch, except
> when a RETURNING clause is specified.
>

I have one question that is common to both this patch and parallel
inserts in CTAS [1]: should we skip creating the tuple queues
(ExecParallelSetupTupleQueues), given that no tuples are sent from the
workers to the leader? To put it another way, is the tuple queue used
to share any information other than tuples from the workers to the
leader?

[1] https://www.postgresql.org/message-id/CALj2ACWFq6Z4_jd9RPByURB8-Y8wccQWzLf%2B0-Jg%2BKYT7ZO-Ug%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
