Re: Parallel Inserts in CREATE TABLE AS

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2020-12-25 01:41:51
Message-ID: CALDaNm1DG46RfhT7s_9_ZAztm72GcwxD0LrEsXq4MSdopLCB3A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Thu, Dec 24, 2020 at 10:25 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy
> > <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> > > Attaching v14 patch set that has above changes. Please consider this
> > > for further review.
> > >
> >
> > Few comments:
> > In the below case, should create be above Gather?
> > postgres=# explain create table t7 as select * from t6;
> > QUERY PLAN
> > -------------------------------------------------------------------
> > Gather (cost=0.00..9.17 rows=0 width=4)
> >   Workers Planned: 2
> >   -> Create t7
> >        -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > (4 rows)
> >
> > Can we change it to something like:
> > -------------------------------------------------------------------
> > Create t7
> >   -> Gather (cost=0.00..9.17 rows=0 width=4)
> >        Workers Planned: 2
> >        -> Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)
> > (4 rows)
> >
>
> I think it is better to have it in a way as in the current patch
> because that reflects that we are performing insert/create below
> Gather which is the purpose of this patch. I think this is similar to
> what the Parallel Insert patch [1] has for a similar plan.
>
>
> [1] - https://commitfest.postgresql.org/31/2844/
>

Another thought: the Gather node is what actually performs the insert
operation; the table itself is created earlier. Should we change "Create
table" to "Insert table", something like below?
QUERY PLAN
-------------------------------------------------------------------
Gather (cost=0.00..9.17 rows=0 width=4)
  Workers Planned: 2
  -> Insert table2 (instead of Create table2)
       -> Parallel Seq Scan on table1 (cost=0.00..9.17 rows=417 width=4)
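
To make the node orderings discussed in this thread easier to compare side by side, here is a minimal sketch in Python. It is not part of the patch and does not reflect how PostgreSQL builds or prints plans; PlanNode and render are hypothetical names used only for illustration, and the indentation is approximate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for one node of an EXPLAIN tree."""
    label: str
    details: List[str] = field(default_factory=list)
    child: Optional["PlanNode"] = None


def render(node: PlanNode, depth: int = 0) -> List[str]:
    """Return EXPLAIN-like text lines for the node and its single child chain."""
    pad = "    " * depth
    head = node.label if depth == 0 else f"{pad}-> {node.label}"
    lines = [head]
    # Detail lines (e.g. "Workers Planned: 2") sit one level under their node.
    lines += [f"{pad}    {d}" for d in node.details]
    if node.child is not None:
        lines += render(node.child, depth + 1)
    return lines


GATHER = "Gather (cost=0.00..9.17 rows=0 width=4)"
SCAN = "Parallel Seq Scan on t6 (cost=0.00..9.17 rows=417 width=4)"


def below_gather(top_label: str) -> PlanNode:
    """Shape in the current patch: the create/insert node sits under Gather."""
    return PlanNode(GATHER, ["Workers Planned: 2"],
                    PlanNode(top_label, child=PlanNode(SCAN)))


def above_gather(top_label: str) -> PlanNode:
    """Alternative shape: the create node sits on top of Gather."""
    return PlanNode(top_label,
                    child=PlanNode(GATHER, ["Workers Planned: 2"],
                                   PlanNode(SCAN)))


if __name__ == "__main__":
    for plan in (below_gather("Create t7"),
                 above_gather("Create t7"),
                 below_gather("Insert t7")):
        print("\n".join(render(plan)))
        print()
```

Running it prints the three variants in turn: the current patch output, the "Create above Gather" alternative, and the "Insert" labelling suggested here.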

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
