Re: parallelize queries containing subplans

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelize queries containing subplans
Date: 2017-01-19 09:49:36
Message-ID: CAGz5QCLB=Gw8D0rbaUV-kCaTxPSBN_Fz2yRKjCi21BcuP0BK2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 19, 2017 at 3:05 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> @@ -1213,6 +1216,7 @@ SS_process_ctes(PlannerInfo *root)
> &splan->firstColCollation);
> splan->useHashTable = false;
> splan->unknownEqFalse = false;
> + splan->parallel_safe = best_path->parallel_safe;
>
> I noticed that if path for CTE is parallel safe then we are marking
> CTE subplan as parallel safe, In particular, I don't have any problem
> with that, but can you add some test case which can cover this path, I
> mean to say where CTE subplan are pushed.
>
> ------------
> I have tried to test the subplan with CTE below is my test.
> create table t1(a int , b varchar);
> create table t (n int, b varchar);
>
> Query:
> explain verbose select * from t where t.n not in (WITH RECURSIVE t(n) AS (
> VALUES (1)
> UNION ALL
> SELECT a+1 FROM t1 WHERE a < 100
> )
> SELECT sum(n) FROM t);
>
> During debugging I found that subplan created for below part of the
> query is parallel_unsafe, Is it a problem or there is some explanation
> of why it's not parallel_safe,
>
> (WITH RECURSIVE t(n) AS (
> VALUES (1)
> UNION ALL
> SELECT a+1 FROM t1 WHERE a < 100
> )
> SELECT sum(n) FROM t);
> ----------
The corresponding plan for the query you have specified is:

QUERY PLAN
----------------------------------------------------------------------------------
Seq Scan on public.t (cost=40.73..20894.74 rows=500480 width=35)
Output: t.n, t.b
Filter: (NOT (hashed SubPlan 2))
SubPlan 2
-> Aggregate (cost=40.72..40.73 rows=1 width=8)
Output: sum(t_1.n)
CTE t
-> Append (cost=0.00..31.18 rows=424 width=4)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: 1
-> Seq Scan on public.t1 (cost=0.00..26.93
rows=423 width=4)
Output: (t1.a + 1)
Filter: (t1.a < 100)
-> CTE Scan on t t_1 (cost=0.00..8.48 rows=424 width=4)
Output: t_1.n
(15 rows)

Now, the plan for CTE is parallel_safe. But, a CTE plan is converted
to an InitPlan and returns a Param which is used in the CTE Scan.
Since Param is not parallel_safe till now, the SubPlan is also not
parallel_safe. This is why CTE subplans will not be pushed under
Gather.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kuntal Ghosh 2017-01-19 10:12:26 Re: parallelize queries containing subplans
Previous Message Kyotaro HORIGUCHI 2017-01-19 09:37:31 Re: Bug in Physical Replication Slots (at least 9.5)?