Re: Parallel Inserts in CREATE TABLE AS

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2020-12-30 11:56:05
Message-ID: CALDaNm15EjnP9nDY-LM-Sci_4YxbCo_MQ7BprR48awbOUM8Uag@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 30, 2020 at 10:47 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Wed, Dec 30, 2020 at 10:32 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > I have completed reviewing 0001, I don't have more comments, just one
> > question. Soon I will review the remaining patches.
>
> Thanks.
>
> > + /* If parallel inserts are to be allowed, set a few extra information. */
> > + if (myState->is_parallel)
> > + {
> > + myState->object_id = intoRelationAddr.objectId;
> > +
> > + /*
> > + * We don't need to skip contacting FSM while inserting tuples for
> > + * parallel mode, while extending the relations, workers instead of
> > + * blocking on a page while another worker is inserting, can check the
> > + * FSM for another page that can accommodate the tuples. This results
> > + * in major benefit for parallel inserts.
> > + */
> > + myState->ti_options = 0;
> >
> > Is there any performance data for this or just theoretical analysis?
>
> I have seen that we don't get much performance with the skip fsm
> option, though I don't have the data to back it up. I'm planning to
> run performance tests after the patches 0001, 0002 and 0003 get
> reviewed. I will capture the data at that time. Hope that's fine.
>

When you run the performance tests, please also capture and publish the
relation size and the number of pages created for the base table and
the CTAS table. You can use something like SELECT relpages FROM
pg_class WHERE relname = 'tablename' and SELECT
pg_total_relation_size('tablename'). This is just to make sure that
there is no significant difference between the base table and the CTAS
table.
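
A minimal sketch of how the two checks could be combined into one
query. The table names base_tbl and ctas_tbl are hypothetical
placeholders, not names from the patch or the tests. The ANALYZE step
and the actual_pages column are additions for illustration: relpages
is only an estimate that VACUUM, ANALYZE and a few DDL commands
refresh, so it can be stale right after the table is created.

```sql
-- Hypothetical table names; substitute the real base and CTAS tables.
-- Refresh the relpages estimate before reading it.
ANALYZE base_tbl;
ANALYZE ctas_tbl;

SELECT c.relname,
       c.relpages,                                   -- estimated page count
       pg_relation_size(c.oid)
         / current_setting('block_size')::int AS actual_pages,  -- main fork only
       pg_total_relation_size(c.oid) AS total_bytes, -- includes indexes and TOAST
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
WHERE c.relname IN ('base_tbl', 'ctas_tbl')
  AND c.relkind = 'r'                                -- ordinary tables only
ORDER BY c.relname;
```

If the same table name exists in more than one schema, the query
returns a row for each, so a filter on relnamespace may be needed.
Running it once for a serial CTAS and once for a parallel CTAS gives
the numbers to compare.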

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
