Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-22 12:46:40
Message-ID: CAFjFpRefs5ZMnxQ2vP9v5zOtWtNPuiMYc01sb1SWjCOB1CT=uQ@mail.gmail.com
Lists: pgsql-hackers

Here's a set of updated patches rebased on
1148e22a82edc96172fc78855da392b6f0015c88.

I have fixed all the issues reported so far.

I have also completed reparameterize_path_by_child() for all the
required paths. There's no TODO there now. :) The function has grown
quite long and might take some time to review. Given its size, I
am wondering whether we should separate that fix from the main
partition-wise join fix. That would make the function easier to
review carefully. Here's how that could be done. As explained in the
commit message of 0009, the function is required in case of lateral
joins between partitioned relations. For A LATERAL JOIN B, B is the
minimum required parameterization of A. Hence the children of A,
i.e. A1, A2, ..., all require their paths to be parameterized by B.
When it comes to partition-wise join, A1 requires its paths to be
parameterized by B1 (the matching partition of B). Otherwise we
cannot create paths for A1B1. This means that we need to
reparameterize all of A1's paths by B1 using
reparameterize_path_by_child(). So the function needs to support
reparameterization of all kinds of paths; we do not know which of
those have survived add_path(). But if we disable partition-wise
join for lateral joins, i.e. when direct_lateral_relids of one relation
contains any of the relids of the other relation, we do not
need reparameterize_path_by_child(). Please let me know if this
strategy will help to make review and commit easier.
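
The check proposed above could be sketched roughly as follows. This is a toy Python model, not PostgreSQL code: frozensets stand in for relid Bitmapsets, and the `Rel` class and `can_use_partition_wise_join()` function are hypothetical names introduced only for illustration. It assumes the intended condition is a plain overlap test between one relation's direct_lateral_relids and the other relation's relids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rel:
    """Toy stand-in for a RelOptInfo; only the fields the check needs."""
    relids: frozenset                                # base relids covered by this rel
    direct_lateral_relids: frozenset = frozenset()   # rels it references laterally


def can_use_partition_wise_join(rel1: Rel, rel2: Rel) -> bool:
    """Return False when either side laterally references the other.

    Hypothetical helper: in that case the child paths would have to be
    reparameterized, so this sketch simply rules the join out.
    """
    if rel1.direct_lateral_relids & rel2.relids:
        return False
    if rel2.direct_lateral_relids & rel1.relids:
        return False
    return True


# A (relid 1) laterally references B (relid 2); C (relid 3) has no lateral refs.
a = Rel(relids=frozenset({1}), direct_lateral_relids=frozenset({2}))
b = Rel(relids=frozenset({2}))
c = Rel(relids=frozenset({3}))

print(can_use_partition_wise_join(a, b))  # lateral reference between A and B
print(can_use_partition_wise_join(b, c))  # no lateral reference
```

With such a check in place, the A/B pair would fall back to a non-partition-wise join, while pairs without a lateral reference would be unaffected.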

After the commit,
commit d3cc37f1d801a6b5cad9bf179274a8d767f1ee50
Author: Robert Haas <rhaas(at)postgresql(dot)org>
Date: Tue Mar 21 09:48:04 2017 -0400

Don't scan partitioned tables.

we do not create any AppendRelInfos, and hence RelOptInfos, for the
partitioned tables. My approach to multi-level partition-wise join
was to store the RelOptInfos of immediate partitions in part_rels of
the RelOptInfo of a partitioned table, thus maintaining a tree of
RelOptInfos reflecting the partitioning tree. This allows adding append
paths to intermediate RelOptInfos, flattening them as we go up the
partitioning hierarchy. With no RelOptInfos for intermediate
partitions, we can support multi-level partition-wise join only in
limited cases, when the partitioning hierarchies of the joining tables
exactly match. Please refer to [1] for some more discussion.
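
To illustrate the tree of RelOptInfos described above, here is a small Python sketch. Only the field name part_rels comes from the patches; the `PartRel` class, `append_subpaths()` and `hierarchies_match()` are hypothetical names, paths are represented as plain strings, and the shape comparison ignores partition bounds entirely, so it is a simplification rather than the actual matching logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PartRel:
    """Toy stand-in for a RelOptInfo of a table or partition."""
    name: str
    part_rels: List["PartRel"] = field(default_factory=list)  # immediate partitions only


def append_subpaths(rel: PartRel) -> List[str]:
    """Collect subpaths for rel's append path, flattening child appends."""
    if not rel.part_rels:
        return [f"scan({rel.name})"]          # leaf partition: scanned directly
    subpaths: List[str] = []
    for child in rel.part_rels:
        # A partitioned child contributes its own flattened subpaths.
        subpaths.extend(append_subpaths(child))
    return subpaths


def hierarchies_match(a: PartRel, b: PartRel) -> bool:
    """Simplified check: same tree shape at every level (bounds ignored)."""
    if len(a.part_rels) != len(b.part_rels):
        return False
    return all(hierarchies_match(x, y) for x, y in zip(a.part_rels, b.part_rels))


# t has two partitions; t1 is itself partitioned into t1_a and t1_b.
t = PartRel("t", [PartRel("t1", [PartRel("t1_a"), PartRel("t1_b")]), PartRel("t2")])
# u is partitioned one level only, so its hierarchy differs from t's.
u = PartRel("u", [PartRel("u1"), PartRel("u2")])

print(append_subpaths(t))
print(hierarchies_match(t, u))
```

In this model the intermediate node t1 is what holds the partial append result; without it there is nothing to attach the sub-hierarchy to.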

I think we need the RelOptInfos for the partitions that are themselves
partitioned, to hold the "append" paths containing paths from their
children and to match the partitions in partition-wise join. A similar
hierarchy will be created for partitioned joins, whose partitions may
themselves be partitioned joins. So, I have not changed the multi-level
partition-wise join support patches. After applying 0011-0013 the
multi-level partitioning tests fail with the error "could not find the
RelOptInfo of a partition with oid", since the code does not find the
RelOptInfos of partitions that are themselves partitioned.

[1] https://www.postgresql.org/message-id/CAFjFpRceMmx26653XFAYvc5KVQcrzcKScVFqZdbXV%3DkB8Akkqg@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_dp_join_patches_v12.zip application/zip 60.7 KB
