Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Inserts in CREATE TABLE AS
Date: 2020-11-26 09:15:20
Message-ID: CALj2ACVijBoAed8LBGqBN-Aiox8BtebPx5CfkRS-YaUTHrvbtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 26, 2020 at 12:15 PM Hou, Zhijie <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
>
> I took a deep look at the projection logic.
> In most cases, you are right that Gather node does not need projection.
>
> In some rare cases, such as Subplan (or initplan I guess).
> The projection will happen in Gather node.
>
> The example:
>
> Create table test(i int);
> Create table test2(a int, b int);
> insert into test values(generate_series(1,10000000,1));
> insert into test2 values(generate_series(1,1000,1), generate_series(1,1000,1));
>
> postgres=# explain(verbose, costs off) select test.i,(select i from (select * from test2) as tt limit 1) from test where test.i < 2000;
> QUERY PLAN
> ----------------------------------------
> Gather
> Output: test.i, (SubPlan 1)
> Workers Planned: 2
> -> Parallel Seq Scan on public.test
> Output: test.i
> Filter: (test.i < 2000)
> SubPlan 1
> -> Limit
> Output: (test.i)
> -> Seq Scan on public.test2
> Output: test.i
>
> In this case, projection is necessary,
> because the subplan will be executed in projection.
>
> If skipped, the table created will loss some data.
>

Thanks a lot for the use case. Yes with the current patch table will
lose data related to the subplan. On analyzing further, I think we can
not allow parallel inserts in the cases when the Gather node has some
projections to do. Because the workers can not perform that
projection. So, having ps_ProjInfo in the Gather node is an indication
for us to disable parallel inserts and only the leader can do the
insertions after the Gather node does the required projections.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2020-11-26 09:28:21 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Paul Förster 2020-11-26 09:11:00 configure and DocBook XML