Pushdown target list below gather node (WAS Re: WIP: Upper planner pathification)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Pushdown target list below gather node (WAS Re: WIP: Upper planner pathification)
Date: 2016-03-16 07:09:56
Message-ID: CAA4eK1Jk8hm-2j-CKjvdd0CZTsdPX=EdK_qhzc4689hq0xtfMQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 9, 2016 at 11:58 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Mar 9, 2016 at 12:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Gather is a bit weird, because although it can project (and needs to,
> > per the example of needing to compute a non-parallel-safe function),
> > you would rather push down as much work as possible to the child node;
> > and doing so is semantically OK for parallel-safe functions. (Pushing
> > functions down past a Sort node, for a counterexample, is not so OK
> > if you are concerned about function evaluation order, or even number
> > of executions.)
> >
> > In the current code structure it would perhaps be reasonable to teach
> > apply_projection_to_path about that --- although this would require
> > logic to separate parallel-safe and non-parallel-safe subexpressions,
> > which doesn't quite seem like something apply_projection_to_path
> > should be doing.
>
> I think for v1 it would be fine to make this all-or-nothing; that's
> what I had in mind to do. That is, if the entire tlist is
> parallel-safe, push it all down. If not, let the workers just return
> the necessary Vars and have Gather compute the final tlist.
>

I found it quite convenient to teach apply_projection_to_path() to push the
target list down beneath a Gather node when the target list contains
parallel-safe expressions. The attached patch implements this pushdown.
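
To make the rule concrete, here is a toy sketch in C of the all-or-nothing
decision Robert describes above: if every target-list entry is parallel-safe,
project in the workers; otherwise let Gather compute the final tlist. This is
not the attached patch and it uses no PostgreSQL APIs. ToyTargetEntry,
ProjectionSite and both functions are hypothetical names, and the
parallel_safe flag is assumed to have been computed beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a target-list entry; not a PostgreSQL struct. */
typedef struct ToyTargetEntry
{
    const char *expr;          /* printable expression, e.g. "c1 + 2" */
    bool        parallel_safe; /* assumed to be known up front */
} ToyTargetEntry;

/* Where the final target list gets evaluated. */
typedef enum ProjectionSite
{
    PROJECT_IN_WORKERS,        /* tlist pushed beneath Gather */
    PROJECT_IN_GATHER          /* workers return Vars, Gather projects */
} ProjectionSite;

/* True only if every entry is parallel-safe (vacuously true when empty). */
bool
toy_tlist_is_parallel_safe(const ToyTargetEntry *tlist, size_t ntlist)
{
    for (size_t i = 0; i < ntlist; i++)
    {
        if (!tlist[i].parallel_safe)
            return false;
    }
    return true;
}

/* All-or-nothing: push the whole tlist down, or none of it. */
ProjectionSite
toy_choose_projection_site(const ToyTargetEntry *tlist, size_t ntlist)
{
    assert(tlist != NULL || ntlist == 0);

    return toy_tlist_is_parallel_safe(tlist, ntlist)
        ? PROJECT_IN_WORKERS
        : PROJECT_IN_GATHER;
}
```

With a single safe entry such as (c1 + 2), the sketch picks
PROJECT_IN_WORKERS; adding one unsafe entry flips the whole list to
PROJECT_IN_GATHER, since no attempt is made to split subexpressions.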

Below is the output of a simple test that shows the effect of the patch.

Without Patch -
------------------------
postgres=# explain verbose select c1+2 from t1 where c1<10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather  (cost=0.00..44420.43 rows=30 width=4)
   Output: (c1 + 2)
   Number of Workers: 2
   ->  Parallel Seq Scan on public.t1  (cost=0.00..44420.35 rows=13 width=4)
         Output: c1
         Filter: (t1.c1 < 10)
(6 rows)

With Patch -
-----------------------
postgres=# explain verbose select c1+2 from t1 where c1<10;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather  (cost=0.00..45063.75 rows=30 width=4)
   Output: ((c1 + 2))
   Number of Workers: 1
   ->  Parallel Seq Scan on public.t1  (cost=0.00..45063.68 rows=18 width=4)
         Output: (c1 + 2)
         Filter: (t1.c1 < 10)
(6 rows)

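In the above plans, notice that with the patch the target-list expression
(c1 + 2) is computed by the Parallel Seq Scan beneath the Gather node,
whereas without the patch the scan outputs only c1 and Gather computes it.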

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel-tlist-pushdown-v1.patch application/octet-stream 1.1 KB
