Re: parallelize queries containing initplans

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelize queries containing initplans
Date: 2017-08-09 12:59:13
Message-ID: CAJrrPGdMbAjaFOaJrqDV3wSJMoSPqixwVxjtvk5y7BJ3jKpXEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 9, 2017 at 8:54 PM, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
wrote:

> On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> >
> > I tested the latest patch and the parallel plan is getting choose for
> most
> > of
> > the init plans.
> >
> Thanks for testing.
>
> > For the following query the parallel plan is not chosen. The query
> contains
> > an init plan that refer the outer node.
> >
> > postgres=# explain analyze select * from t1 where t1.i in (select t2.i
> from
> > t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
> > QUERY PLAN
> > ------------------------------------------------------------
> ---------------------------------------------------------
> > Seq Scan on t1 (cost=0.00..22426.28 rows=448 width=12) (actual
> > time=8.335..132.557 rows=2 loops=1)
> > Filter: (SubPlan 2)
> > Rows Removed by Filter: 894
> > SubPlan 2
> > -> Result (cost=16.27..31.26 rows=999 width=4) (actual
> > time=0.146..0.146 rows=0 loops=896)
> > One-Time Filter: (t1.k = $1)
> > InitPlan 1 (returns $1)
> > -> Aggregate (cost=16.25..16.27 rows=1 width=4) (actual
> > time=0.145..0.145 rows=1 loops=896)
> > -> Seq Scan on t3 (cost=0.00..16.25 rows=2 width=4)
> > (actual time=0.131..0.144 rows=0 loops=896)
> > Filter: (i = t1.i)
> > Rows Removed by Filter: 900
> > -> Seq Scan on t2 (cost=16.27..31.26 rows=999 width=4)
> (actual
> > time=0.012..0.013 rows=10 loops=2)
> > Planning time: 0.272 ms
> > Execution time: 132.623 ms
> > (14 rows)
> >
> An observation is that the filter at Result node can't be pushed down
> to the sequential scan on t2 because the filter is on t1. So, it has
> to scan the complete t2 relation and send all the tuple to upper node,
> a worst case for parallelism. Probably, this is the reason the
> optimizer doesn't pick parallel plan for the above case.
>
> Just for clarification, do you see any changes in the plan after
> forcing parallelism(parallel_tuple_cost, parallel_setup_cost,
> min_parallel_table_scan_size=0)?

There is no plan change with parallel* GUC changes.

Regards,
Hari Babu
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Finzel 2017-08-09 13:14:22 Review of DDL replication solution
Previous Message Aleksander Alekseev 2017-08-09 12:47:02 Re: Cache lookup errors with functions manipulation object addresses