Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-22 17:16:52
Message-ID: CA+TgmoaqoVYqyekE3bKrTWoQuOhxnVij6-i7hCpf_4HyNmoi_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 22, 2017 at 9:59 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> In my testing last week, reparameterize_path_by_child() was essential
>> for nested loops to work properly, even without LATERAL. Without it,
>> the parameterized path ends up containing vars that reference the
>> parent varno instead of the child varno. That confused later planner
>> stages so that those Vars did not get replaced with Param during
>> replace_nestloop_params(), eventually resulting in a crash at
>> execution time.
>
> I half-described the solution. Sorry. Along-with disabling
> partition-wise lateral joins, we have to disable nested loop
> child-joins where inner child is parameterized by the parent of the
> outer one. We will still have nestloop join between parents where
> inner relation is parameterized by the outer and every child of inner
> is parameterized by the outer. But we won't create nest loop joins
> where inner child is parameterized by the outer child, where we
> require reparameterize_path_by_child. We will loose this optimization
> only till we get reparameterize_path_by_child() committed. Basically,
> in try_nestloop_path() (in the patch 0009), if
> (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent)), give up
> creating nest loop path. That shouldn't create any problems.
>
> Did you experiment with this change in try_nestloop_path()? Can you
> please share the testcase? I will take a look at it.

I didn't save the test case. It was basically just forcing a
partitionwise nestloop join between two equipartitioned tables, with
the calls to adjust_appendrel_attrs() ripped out of
reparameterize_path_by_child(), just to see what would break.

>> Based on that experiment, I think we could consider
>> having reparameterize_path_by_child() handle only scan paths as
>> reparameterize_path() does, and just give up on plans like this:
>>
>> Append
>> -> Left Join
>> -> Scan on a
>> -> Inner Join
>> -> Index Scan on b
>> -> Index Scan on c
>> [repeat for each partition]
>>
>
> I am assuming that a, b and c are partitions of A, B and C resp. which
> are being joined and both or one of the scans on b and c are
> parameteried by a or scan of c is parameterized by b.

Yes.

> I don't think we will get away by supporting just scan paths, since
> the inner side of lateral join can be any paths not just scan path. Or
> you are suggesting that we disable partition-wise lateral join and
> support reparameterization of only scan paths?

I think if you can do a straight-up partitionwise nested loop between
two tables A and B, that's pretty bad. But if there are more complex
cases that involve parameterizing entire join trees which aren't
covered, that's less bad. Parallel query almost entirely punts on
LATERAL right now, and nobody's complained yet. I'm sure that'll need
to get fixed someday, but not today.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2017-03-22 17:21:20 Re: patch proposal
Previous Message Tom Lane 2017-03-22 17:15:41 Re: WIP: Faster Expression Processing v4