Re: parallel.sgml for Gather with InitPlans

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel.sgml for Gather with InitPlans
Date: 2018-05-08 03:34:16
Message-ID: CAA4eK1JFN-F8PjYCmqkXj3j6=BqSRLfSA94Nfy3kRr4hA86-oQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 7, 2018 at 11:07 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> In the wake of commit e89a71fb449af2ef74f47be1175f99956cf21524,
> parallel.sgml is no longer correct about the effect of InitPlans:
>
> <para>
> The following operations are always parallel restricted.
> </para>
>
> ...
>
> <para>
> Access to an <literal>InitPlan</literal> or correlated
> <literal>SubPlan</literal>.
> </para>
>
> I thought about this a bit and came up with the attached patch.
>

- Access to an <literal>InitPlan</literal> or correlated <literal>SubPlan</literal>.
+ Plan nodes to which an <literal>InitPlan</literal> is attached.
+ </para>
+ </listitem>

Is this correct? See the example below:

Serial-Plan
-----------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on t1 (cost=35.51..71.01 rows=10 width=12)
   Filter: (k = $0)
   InitPlan 1 (returns $0)
     -> Aggregate (cost=35.50..35.51 rows=1 width=4)
          -> Seq Scan on t3 (cost=0.00..30.40 rows=2040 width=4)
(5 rows)

Parallel-Plan
--------------------
postgres=# explain select * from t1 where t1.k=(select max(k) from t3);
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Gather (cost=9.71..19.38 rows=2 width=12)
   Workers Planned: 2
   Params Evaluated: $1
   InitPlan 1 (returns $1)
     -> Finalize Aggregate (cost=9.70..9.71 rows=1 width=4)
          -> Gather (cost=9.69..9.70 rows=2 width=4)
               Workers Planned: 2
               -> Partial Aggregate (cost=9.69..9.70 rows=1 width=4)
                    -> Parallel Seq Scan on t3 (cost=0.00..8.75 rows=375 width=4)
   -> Parallel Seq Scan on t1 (cost=0.00..9.67 rows=1 width=12)
        Filter: (k = $1)
(11 rows)

In the above example, the InitPlan is attached to a plan node (Seq Scan
on t1) that is not parallel restricted.
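
For anyone who wants to reproduce the two plans, here is a minimal
sketch. The table definitions and row counts are assumptions (they are
not shown above), so the costs and row estimates will differ, and the
planner may or may not choose the same plan shapes on a given build.

```sql
-- Hypothetical schema: only the table names and column k come from the
-- example above; the extra columns and the data are assumptions.
CREATE TABLE t1 (k int, a int, b int);
CREATE TABLE t3 (k int);
INSERT INTO t1 SELECT i, i, i FROM generate_series(1, 1000) AS i;
INSERT INTO t3 SELECT i FROM generate_series(1, 1000) AS i;
ANALYZE t1;
ANALYZE t3;

-- Serial plan: disable parallel query for this session.
SET max_parallel_workers_per_gather = 0;
EXPLAIN SELECT * FROM t1 WHERE t1.k = (SELECT max(k) FROM t3);

-- Parallel plan: make parallelism look cheap so that the planner
-- considers it even for small tables.
SET max_parallel_workers_per_gather = 2;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;
EXPLAIN SELECT * FROM t1 WHERE t1.k = (SELECT max(k) FROM t3);
```

In the second EXPLAIN output, the things to look for are the
"Params Evaluated" line and the node under which "InitPlan 1" is listed.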

> Other ideas?
>

How about changing the statement to:
- Access to an <literal>InitPlan</literal> or correlated <literal>SubPlan</literal>.
+ Access to a correlated <literal>SubPlan</literal>.

I think we can cover InitPlans and SubPlans that can be parallelized in
a separate section, "Parallel Subplans" or some other heading. As of
now, we have enabled parallel subplans and initplans in a limited but
useful set of cases (as per the TPC-H benchmark), and it might be good
to cover them in a separate section. I can come up with an initial
patch (or I can review it if you write the patch) if you and/or others
think that makes sense.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
