Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-23 12:48:59
Message-ID: CAFjFpRcMWwepj-Do1otxQ-GApGPSZ1FmH7YQvQTwzQOGczq_sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 23, 2017 at 1:17 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Mar 22, 2017 at 8:46 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> Here's set of updated patches rebased on
>> 1148e22a82edc96172fc78855da392b6f0015c88.
>>
>> I have fixed all the issues reported till now.
>
> I don't understand why patch 0001 ends up changing every existing test
> for RELOPT_JOINREL anywhere in the source tree to use IS_JOIN_REL(),
> yet none of the existing tests for RELOPT_OTHER_MEMBER_REL end up
> getting changed to use IS_OTHER_REL(). That's very surprising. Some
> of those tests are essentially checking for something that is going to
> have a scan plan rather than a join or upper plan, and those tests
> probably don't need to be modified; for example, the test in
> set_rel_consider_parallel() is obviously of this type. But others are
> testing whether we've got some kind of child rel, and those seem like
> they might need work. Going through a few specific examples:
>
> - generate_join_implied_equalities_for_ecs() assumes that any child
> rel is an other member rel.
> - generate_join_implied_equalities_broken() assumes that any child rel
> is an other member rel.

Fixed those.

> - generate_implied_equalities_for_column() set is_child_rel on the
> assumption that only an other member rel can be a child rel.

This function is called for indexes, which are not defined on the
join relations. So, we shouldn't worry about child-joins here. I have
added an assertion in there to make sure that that function gets
called only for base and "other" member rels.

> - eclass_useful_for_merging() assumes that the only kind of child rel
> is an other member rel.

This was being fixed in a later patch which had many small fixes for
handling child-joins. But now I have moved that fix into 0001.

> - find_childrel_appendrelinfo() assumes that any child rel is an other
> member rel.

The function is called for "other" member relation only. For joins we
use find_appinfos_by_relids() We could replace
find_childrel_appendrelinfo() with find_appinfos_by_relids(), which
does same thing as find_childrel_appendrelinfo() for a relids set. But
find_appinfos_by_relids() returns a list of AppendRelInfos, hence
using it instead of find_childrel_appendrelinfo() will spend some
memory and CPU cycles in creating a one element list and then
extracting that element out of the list. So, I have not replaced
usages of find_childrel_appendrelinfo() with
find_appinfos_by_relids(). This also simplifies changes to
get_useful_ecs_for_relation().

> - find_childrel_top_parent() and find_childrel_parents() assume that
> children must be other member rels and their parents must be baserels.

For partition-wise join implementation we save relids of topmost
parent in RelOptInfo of child. We can directly use that instead of
calling find_childrel_top_parent(). So, in 0001 I am adding
top_parent_relids in RelOptInfo and getting rid of
find_childrel_top_parent(). This also fixes
get_useful_ecs_for_relation() in a better way. find_childrel_parents()
is called only for simple relations not joins, since it's callers are
called only for simple relations. I have added an assertion to that
effect.

> - adjust_appendrel_attrs_multilevel() assumes that children must be
> other member rels and their parents must be baserels.

It was being fixed in a later patch. In the attached patch set 0001
changes it to use IS_OTHER_REL().

>
> It's possible that, for various reasons, none of these code paths
> would ever be reachable by a child join, but it doesn't look likely to
> me. And even if that's true, some comment updates are probably
> needed, and maybe some renaming of functions too.

Now commit messages of 0001 explains which instances of
RELOPT_OTHER_MEMBER_REL and RELOPT_BASEREL have been changed, and
which have been retained and why. Also, added assertions wherever
necessary.

>
> In postgres_fdw, get_useful_ecs_for_relation() assumes that any child
> rel is an other member rel. I'm not sure if we're hoping that
> partitionwise join will work with postgres_fdw's join pushdown out of
> the chute, but clearly this would need to be adjusted to have any
> chance of being right.

Fixed this as explained above.

>
> Some that seem OK:
>
> - set_rel_consider_parallel() is fine.
> - set_append_rel_size() is only going to be called for baserels or
> their children, so it's fine.
> - relation_excluded_by_constraints() is only intended to be called on
> baserels or their children, so it's fine.
> - check_index_predicates() is only intended to be called on baserels
> or their children, so it's fine.
> - query_planner() loops over baserels and their children, so it's fine.
>

Right.

> Perhaps we could introduce an IS_BASEREL_OR_CHILD() test that could be
> used in some of these places, just for symmetry.

I was wondering about this as well. Although, I though it better not
to touch base relations in partition-wise join. But now, I have added
that macro and adjusted corresponding tests in the code. See 0001.

You may actually want to squash 0001 and 0002 into a single patch. But
for now, I have left those as separate.

> The point is that
> there are really three questions here: (1) is it some kind of baserel
> (parent or child)? (2) is it some kind of joinrel (parent or child)?
> and (3) is it some kind of child (baserel or join)? Right now, both
> #2 and #3 are tested by just comparing against
> RELOPT_OTHER_MEMBER_REL, but they become different tests as soon as we
> add child joinrels. The goal of 0001, IMV, ought to be to try to
> figure out which of #1, #2, and #3 is being checked in each case and
> make that clear via use of an appropriate macro. (If is-other-baserel
> is the real test, then fine, but I bet that's a rare case.)

Agreed. I have gone through all the cases, and fixed the necessary
ones as explained above and in the commit messages of 0001.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_dp_join_patches_v13.zip application/zip 65.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-03-23 12:51:55 Re: GSOC'17 project introduction: Parallel COPY execution with errors handling
Previous Message Elvis Pranskevichus 2017-03-23 12:47:27 Re: [PATCH v1] Add and report the new "in_hot_standby" GUC pseudo-variable.