Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2021-01-06 06:43:24
Message-ID: CALj2ACW+5RK+rLrcH_V1KQmkkaiKECEodr8o9Fp6NF8z+3282A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 6, 2021 at 11:30 AM Hou, Zhijie <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
>
> > > I think it makes sense.
> > >
> > > And if a check like 'ins_cmd == xxx1 || ins_cmd == xxx2' may be used
> > > in several places, how about defining a generic function with a
> > > comment describing its purpose?
> > >
> > > An example in INSERT INTO SELECT patch:
> > > +/*
> > > + * IsModifySupportedInParallelMode
> > > + *
> > > + * Indicates whether execution of the specified table-modification command
> > > + * (INSERT/UPDATE/DELETE) in parallel-mode is supported, subject to certain
> > > + * parallel-safety conditions.
> > > + */
> > > +static inline bool
> > > +IsModifySupportedInParallelMode(CmdType commandType)
> > > +{
> > > +    /* Currently only INSERT is supported */
> > > +    return (commandType == CMD_INSERT);
> > > +}
> >
> > The intention of the assert is to verify that those functions are called
> > for the appropriate commands, such as CTAS, Refresh Mat View and so on,
> > with correct parameters. I really don't think we can replace the assert
> > with a function like the one above; in release builds the assertion is
> > not checked at all. In a way, that assertion is for debugging purposes
> > only. I also think that, since we as the callers know when to call those
> > new functions, we can even remove the assertions if they are really a
> > problem here. Thoughts?
> Hi
>
> Thanks for the explanation.
>
> If the check on the command type is only used in the assert, I think you are right.
> I suggested a new function because I guessed the check could be used in some other places,
> such as:
>
> + /* Okay to parallelize inserts, so mark it. */
> + if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> + ((DR_intorel *) dest)->is_parallel = true;
>
> + if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> + ((DR_intorel *) dest)->is_parallel = false;

We need to know exactly which command is in use at the above place in
order to cast dest to the right type and set is_parallel, because
is_parallel is being added to the command-specific structures, not to
the generic _DestReceiver structure. So in the future the above code
becomes something like this:

+ /* Okay to parallelize inserts, so mark it. */
+ if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+ ((DR_intorel *) dest)->is_parallel = true;
+ else if (ins_cmd == PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW)
+ ((DR_transientrel *) dest)->is_parallel = true;
+ else if (ins_cmd == PARALLEL_INSERT_CMD_COPY_TO)
+ ((DR_copy *) dest)->is_parallel = true;
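
A minimal, self-contained sketch of why the command type has to be known before the cast. The struct layouts here are simplified stand-ins, not the real PostgreSQL definitions; the enum type name ParallelInsertCmdKind and the helper set_parallel_insert_flag are assumptions made for illustration, and only the CTAS and Refresh Mat View cases are shown:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed enum type name; the constants are the ones used in this thread. */
typedef enum ParallelInsertCmdKind
{
    PARALLEL_INSERT_CMD_UNDEF = 0,
    PARALLEL_INSERT_CMD_CREATE_TABLE_AS,
    PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW
} ParallelInsertCmdKind;

/* Simplified stand-in for the generic receiver: it has no is_parallel. */
typedef struct DestReceiver
{
    int         mydest;
} DestReceiver;

/* Stand-ins for the command-specific receivers that embed the generic one. */
typedef struct DR_intorel
{
    DestReceiver pub;           /* must be first so the cast below is valid */
    bool        is_parallel;
} DR_intorel;

typedef struct DR_transientrel
{
    DestReceiver pub;           /* must be first so the cast below is valid */
    bool        is_parallel;
} DR_transientrel;

/*
 * Hypothetical helper: the flag lives in the command-specific struct, so
 * the command type decides which struct type dest is cast to.
 */
static void
set_parallel_insert_flag(ParallelInsertCmdKind ins_cmd, DestReceiver *dest,
                         bool is_parallel)
{
    assert(dest != NULL);

    switch (ins_cmd)
    {
        case PARALLEL_INSERT_CMD_CREATE_TABLE_AS:
            ((DR_intorel *) dest)->is_parallel = is_parallel;
            break;
        case PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW:
            ((DR_transientrel *) dest)->is_parallel = is_parallel;
            break;
        case PARALLEL_INSERT_CMD_UNDEF:
            /* Not a parallel insert: nothing to mark. */
            break;
    }
}
```

A generic "is this command supported" predicate could not replace the switch here, because each branch needs a different cast.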

In the place below, instead of a new function, I think we can just use
something like if (fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF).

> Or
>
> + if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> + pg_atomic_add_fetch_u64(&fpes->processed, queryDesc->estate->es_processed);
>
> If you think the above code will extend the ins_cmd type check in the future, the generic function may make sense.

We can also change the check below to fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF.

+ if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+ receiver = ExecParallelGetInsReceiver(toc, fpes);
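
A rough sketch of the "!= PARALLEL_INSERT_CMD_UNDEF" form of these checks. The struct is a cut-down stand-in for the shared executor state, the enum type name and the function add_worker_processed are assumptions, and a plain addition stands in for pg_atomic_add_fetch_u64, so this only illustrates the shape of the condition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed enum type name; the constants are the ones used in this thread. */
typedef enum ParallelInsertCmdKind
{
    PARALLEL_INSERT_CMD_UNDEF = 0,
    PARALLEL_INSERT_CMD_CREATE_TABLE_AS,
    PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW
} ParallelInsertCmdKind;

/* Cut-down stand-in holding only the two fields discussed above. */
typedef struct FixedParallelExecutorState
{
    uint64_t    processed;      /* a pg_atomic_uint64 in the real patch */
    ParallelInsertCmdKind ins_cmd_type;
} FixedParallelExecutorState;

/*
 * Hypothetical worker-side step: add this worker's row count to the shared
 * total for any parallel insert command, not only CREATE TABLE AS.
 */
static void
add_worker_processed(FixedParallelExecutorState *fpes, uint64_t es_processed)
{
    assert(fpes != NULL);

    if (fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF)
        fpes->processed += es_processed;    /* non-atomic stand-in */
}
```

Written this way, the condition does not need to change when another command kind is added to the enum.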

If this is okay, I will make these changes in the next version of the patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
