Re: Partition-wise join for join between (declaratively) partitioned tables

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-01 12:35:32
Message-ID: 31895.1504269332@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> I originally thought to provide it along-with the changes to
> expand_inherited_rtentry(), but that thread is taking longer. Jeevan
> Chalke needs rebased patches for his work on aggregate pushdown and
> Thomas might need them for further review. So, here they are.

Since I have related patch in the current commitfest
(https://commitfest.postgresql.org/14/1247/), I spent some time reviewing your
patch:

* generate_partition_wise_join_paths()

Right parenthesis is missing in the prologue.

* get_partitioned_child_rels_for_join()

I think the Assert() statement is easier to understand inside the loop, see
the assert.diff attachment.

* have_partkey_equi_join()

As the function handles generic join, this comment doesn't seem to me
relevant:

/*
* The equi-join between partition keys is strict if equi-join between
* at least one partition key is using a strict operator. See
* explanation about outer join reordering identity 3 in
* optimizer/README
*/
strict_op = op_strict(opexpr->opno);

And I think the function can return true even if strict_op is false for all
the operators evaluated in the loop.

* match_expr_to_partition_keys()

I'm not sure this comment is clear enough:

/*
* If it's a strict equi-join a NULL partition key on one side will
* not join a NULL partition key on the other side. So, rows with NULL
* partition key from a partition on one side can not join with those
* from a non-matching partition on the other side. So, search the
* nullable partition keys as well.
*/
if (!strict_op)
continue;

My understanding of the problem of NULL values generated by outer join is:
these NULL values --- if evaluated by non-strict expression --- can make row
of N-th partition on one side of the join match row(s) of *other than* N-th
partition(s) on the other side. Thus the nullable input expressions may only
be evaluated by strict operators. I think it'd be clearer if you stressed that
(undesired) *match* of partition keys can be a problem, rather than mismatch.

If you insist on your wording, then I think you should at least move the
comment below to the part that only deals with strict operators.

* There are several places where lfirst_node() macro should be used. For
example

rel = lfirst_node(RelOptInfo, lc);

instead of

rel = (RelOptInfo *) lfirst(lc);

* map_and_merge_partitions()

Besides a few changes proposed in map_and_merge_partitions.diff (a few of them
to suppress compiler warnings) I think that this part needs more thought:

{
Assert(mergemap1[index1] != mergemap2[index2] &&
mergemap1[index1] >= 0 && mergemap2[index2] >= 0);

/*
* Both the partitions map to different merged partitions. This
* means that multiple partitions from one relation matches to one
* partition from the other relation. Partition-wise join does not
* handle this case right now, since it requires ganging multiple
* partitions together (into one RelOptInfo).
*/
merged_index = -1;
}

I could hit this path with the following test:

CREATE TABLE a(i int) PARTITION BY LIST(i);
CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2);
CREATE TABLE b(j int) PARTITION BY LIST(j);
CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2);

SET enable_partition_wise_join TO on;

SELECT *
FROM a
FULL JOIN
b ON i = j;

I don't think there's a reason not to join a_0 partition to b_0, is there?

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

Attachment Content-Type Size
assert.diff text/x-diff 959 bytes
map_and_merge_partitions.diff text/x-diff 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2017-09-01 12:35:39 Re: [PATCH] Improve geometric types
Previous Message Aleksander Alekseev 2017-09-01 12:28:02 Re: [PATCH] Off-by-one error in logical slot resource retention