RE: Parallel Inserts in CREATE TABLE AS

From: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Parallel Inserts in CREATE TABLE AS
Date: 2020-11-24 11:13:28
Message-ID: cba3db30f802466ba257a26515cd75ed@G08CNEXMBPEKD05.g08.fujitsu.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I'm very interested in this feature,
and I'm looking at the patch, here are some comments.

1.
+ if (!TupIsNull(outerTupleSlot))
+ {
+ (void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
+ node->ps.state->es_processed++;
+ }
+
+ if(TupIsNull(outerTupleSlot))
+ break;
+ }

How about the following style:

if(TupIsNull(outerTupleSlot))
Break;

(void) node->ps.dest->receiveSlot(outerTupleSlot, node->ps.dest);
node->ps.state->es_processed++;

Which looks cleaner.

2.
+
+ if (into != NULL &&
+ IsA(into, IntoClause))
+ {

The check can be replaced by ISCTAS(into).

3.
+ /*
+ * For parallelizing inserts in CTAS i.e. making each
+ * parallel worker inerst it's tuples, we must send
+ * information such as intoclause(for each worker

'inerst' looks like a typo (insert).

4.
+ /* Estimate space for into clause for CTAS. */
+ if (ISCTAS(planstate->intoclause))
+ {
+ intoclausestr = nodeToString(planstate->intoclause);
+ shm_toc_estimate_chunk(&pcxt->estimator, strlen(intoclausestr) + 1);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
...
+ if (intoclausestr != NULL)
+ {
+ char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+ strlen(intoclausestr) + 1);
+ strcpy(shmptr, intoclausestr);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+ }

The code here call strlen(intoclausestr) for two times,
After checking the existing code in ExecInitParallelPlan,
It used to store the strlen in a variable.

So how about the following style:

intoclause_len = strlen(intoclausestr);
...
/* Store serialized intoclause. */
intoclause_space = shm_toc_allocate(pcxt->toc, intoclause_len + 1);
memcpy(shmptr, intoclausestr, intoclause_len + 1);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, intoclause_space);

the code in ExecInitParallelPlan

5.
+ if (intoclausestr != NULL)
+ {
+ char *shmptr = (char *)shm_toc_allocate(pcxt->toc,
+ strlen(intoclausestr) + 1);
+ strcpy(shmptr, intoclausestr);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_INTO_CLAUSE, shmptr);
+ }
+
/* Set up the tuple queues that the workers will write into. */
- pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);
+ if (intoclausestr == NULL)
+ pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);

The two check about intoclausestr seems can be combined like:

if (intoclausestr != NULL)
{
...
}
else
{
...
}

Best regards,
houzj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Li Japin 2020-11-24 11:16:20 Remove cache_plan argument comments to ri_PlanCheck
Previous Message Michael Paquier 2020-11-24 10:52:08 Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2