Re: Proposal : Parallel Merge Join

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal : Parallel Merge Join
Date: 2017-03-04 07:29:26
Message-ID: CA+TgmoYsysszgexuqUKwwewpGuOb3r+tQUET=q2c7qQQBaBgsA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 3, 2017 at 9:47 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I'm not happy with the way this patch can just happen to latch on to a
>> path that's not parallel-safe rather than one that is and then just
>> give up on a merge join in that case. I already made this argument in
>> https://www.postgresql.org/message-id/CA+TgmobdW2au1Jq5L4ySA2ZhqFmA-qNvD7ZFaZbJWm3c0ysWyw@mail.gmail.com
>> and my opinion hasn't changed.
>
> I think last time I did not fully understand the depth of the problem
> and fixed only one aspect of it: in generate_partial_mergejoin_paths,
> if cheapest_total_inner or cheapest_startup_inner is not parallel
> safe, consider the current path if it is parallel safe. I now see how
> that was completely wrong.
>
> I have one question about fixing it in sort_inner_and_outer. Currently,
> we don't consider parameterized paths for merge join except in the
> case where the cheapest total path itself is parameterized. So IIUC,
> when creating a partial path, we will check whether the
> cheapest_total_inner path is parallel safe; if it is not, we will find
> the cheapest parallel-safe inner path using your new API
> get_cheapest_parallel_safe_total_inner, and we will proceed with that
> path if it is not directly parameterized by the outer?

Remember that partial paths can't be parameterized, and here we're
trying to build a partial mergejoin path. The outer input obviously
won't be parameterized, since it's partial, and it can't satisfy any
parameterization of the inner relation either, because only nested
loops can do that. So it only makes sense to try merge-joining a
partial path against a completely unparameterized, parallel-safe inner
path.
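
To make the selection rule concrete, here is a rough standalone sketch of
the filter I have in mind. It is a simplified model, not planner code:
SketchPath, its fields, and cheapest_parallel_safe_unparam_inner are
hypothetical stand-ins invented for illustration, and the parameterization
is modeled as a plain bitmask rather than the planner's real data
structures. It also scans an unsorted array and compares costs explicitly,
which may not match how the actual function is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner path (assumption). */
typedef struct SketchPath
{
    double   total_cost;      /* estimated total cost of the path */
    bool     parallel_safe;   /* may this path run inside a worker? */
    unsigned required_outer;  /* bitmask of rels it is parameterized by;
                               * 0 means completely unparameterized */
} SketchPath;

/*
 * Return the cheapest path that is both parallel-safe and completely
 * unparameterized, or NULL if no such path exists.  A partial merge
 * join can only use an inner path that passes both tests.
 */
const SketchPath *
cheapest_parallel_safe_unparam_inner(const SketchPath *paths, size_t npaths)
{
    const SketchPath *best = NULL;

    assert(paths != NULL || npaths == 0);

    for (size_t i = 0; i < npaths; i++)
    {
        const SketchPath *p = &paths[i];

        /* Skip anything a worker cannot run or that needs outer params. */
        if (!p->parallel_safe || p->required_outer != 0)
            continue;

        if (best == NULL || p->total_cost < best->total_cost)
            best = p;
    }

    return best;
}
```

Note that the cheapest path overall can be rejected here: a cheaper path
that is parallel-unsafe or parameterized loses to a more expensive one that
passes both tests, and a NULL result means no partial merge join should be
built against this inner relation.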

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
