Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-22 13:02:10
Message-ID: CA+TgmobedycFv-nVC===Y3-LAex5dHw4uT32Z=0L+PL3cVA9Kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 22, 2017 at 8:46 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> I have also completed reparameterize_path_by_child() for all the
> required paths. There's no TODO there now. :) The function has grown
> quite long now and might take some time to review. Given the size, I
> am wondering whether we should separate that fix from the main
> partition-wise join fix. That will make reviewing that function
> easier, allowing a careful review. Here's the idea how that can be
> done. As explained in the commit of 0009, the function is required in
> case of lateral joins between partitioned relations. For a A LATERAL
> JOIN B, B is the minimum required parameterization by A. Hence
> children of A i.e. A1, A2 ... all require their paths to be
> parameterized by B. When that comes to partition-wise joins, A1
> requires its paths to be parameterized by B1 (matching partition from
> B). Otherwise we can not create paths for A1B1. This means that we
> require to reparameterize all A1's paths to be reparameterized by B1
> using function reparameterize_paths_by_child(). So the function needs
> to support reparameterization of all the paths; we do not know which
> of those have survived add_path(). But if we disable partition-wise
> join for lateral joins i.e. when direct_lateral_relids of one relation
> contains the any subset of the relids in the other relation, we do not
> need reparameterize_path_by_child(). Please let me know if this
> strategy will help to make review and commit easier.

In my testing last week, reparameterize_path_by_child() was essential
for nested loops to work properly, even without LATERAL. Without it,
the parameterized path ends up containing vars that reference the
parent varno instead of the child varno. That confused later planner
stages so that those Vars did not get replaced with Param during
replace_nestloop_params(), eventually resulting in a crash at
execution time. Based on that experiment, I think we could consider
having reparameterize_path_by_child() handle only scan paths as
reparameterize_path() does, and just give up on plans like this:

Append
-> Left Join
-> Scan on a
-> Inner Join
-> Index Scan on b
-> Index Scan on c
[repeat for each partition]

But I doubt we can get by without it altogether.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-03-22 13:06:19 Re: Logical decoding on standby
Previous Message Andres Freund 2017-03-22 13:00:07 Metadata about relation creation & full scans.