Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2020-12-24 07:37:28
Message-ID: CALj2ACW0o8Dw9iHMwtnqVLhycUPTPhpPM7ChfbGbRdJKJUZpMg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 24, 2020 at 10:25 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> You could change intoclause_len = strlen(intoclausestr) to
> strlen(intoclausestr) + 1 and use intoclause_len in the remaining
> places. We can avoid the +1 in the other places.
> + /* Estimate space for into clause for CTAS. */
> + if (IS_CTAS(intoclause) && OidIsValid(objectid))
> + {
> + intoclausestr = nodeToString(intoclause);
> + intoclause_len = strlen(intoclausestr);
> + shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + 1);
> + shm_toc_estimate_keys(&pcxt->estimator, 1);
> + }

Done.
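
For reference, the pattern suggested above can be sketched outside PostgreSQL as follows. This is only an illustration: SerializedClause and serialize_clause are hypothetical names, and malloc stands in for the shm_toc estimate and allocate steps. The point is that the terminator is counted once, when the length is computed, and the same value is reused everywhere else.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the serialized into clause that the leader
 * copies into shared memory. Not a PostgreSQL structure.
 */
typedef struct SerializedClause
{
    size_t len;   /* bytes to reserve and copy, including the trailing '\0' */
    char  *data;  /* copy of the string, len bytes long */
} SerializedClause;

/*
 * Compute the length once, terminator included, and reuse it for both the
 * allocation (standing in for the estimate step) and the copy.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
serialize_clause(const char *str, SerializedClause *out)
{
    assert(str != NULL && out != NULL);

    out->len = strlen(str) + 1;        /* the only place +1 appears */
    out->data = malloc(out->len);      /* assumption: replaces shm_toc calls */
    if (out->data == NULL)
        return -1;
    memcpy(out->data, str, out->len);  /* copies the '\0' as well */
    return 0;
}
```

A caller would pass the nodeToString() result as str and later free out->data; in the real patch the memory lives in the DSM segment rather than the heap.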

> Can we use node->nworkers_launched == 0 in place of
> node->need_to_scan_locally, that way the setting and resetting of
> node->need_to_scan_locally can be removed. Unless need_to_scan_locally
> is needed in any of the functions that gets called.
> + /* Enable leader to insert in case no parallel workers were launched. */
> + if (node->nworkers_launched == 0)
> + node->need_to_scan_locally = true;
> +
> + /*
> + * By now, for parallel workers (if launched any), would have started their
> + * work i.e. insertion to target table. In case the leader is chosen to
> + * participate for parallel inserts in CTAS, then finish its share before
> + * going to wait for the parallel workers to finish.
> + */
> + if (node->need_to_scan_locally)
> + {

need_to_scan_locally is set in ExecGather(), and it can still be true
even when nworkers_launched > 0, so I think we cannot remove
need_to_scan_locally from ExecParallelInsertInCTAS.
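
To illustrate why the two conditions are not interchangeable, here is a small standalone model. It is not the executor code: GatherModel, model_launch and conditions_differ are hypothetical names, and the rule used to set the flag (true when no workers were launched, or when the leader is allowed to participate) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the executor's gather state. */
typedef struct GatherModel
{
    int  nworkers_launched;
    bool need_to_scan_locally;
} GatherModel;

/*
 * Assumed rule, for illustration only: the leader scans locally when no
 * workers were launched, or when leader participation is enabled.
 */
static void
model_launch(GatherModel *node, int nworkers_launched, bool leader_participates)
{
    assert(node != NULL);

    node->nworkers_launched = nworkers_launched;
    node->need_to_scan_locally =
        (nworkers_launched == 0) || leader_participates;
}

/* True when "nworkers_launched == 0" and the flag give different answers. */
static bool
conditions_differ(const GatherModel *node)
{
    assert(node != NULL);

    return (node->nworkers_launched == 0) != node->need_to_scan_locally;
}
```

Under this model, launching two workers with leader participation enabled leaves the flag true while nworkers_launched == 0 is false, so replacing one check with the other would skip the leader's share of the inserts.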

Attaching the v15 patch set for further review. Note that the only
change is in the 0001 patch; the other patches remain unchanged from v14.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v15-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch application/octet-stream 33.7 KB
v15-0002-Tuple-Cost-Adjustment-for-Parallel-Inserts-in-CTAS.patch application/octet-stream 12.6 KB
v15-0003-Tests-For-Parallel-Inserts-in-CTAS.patch application/octet-stream 27.9 KB
v15-0004-Enable-CTAS-Parallel-Inserts-For-Append.patch application/octet-stream 44.0 KB
