Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-15 01:21:05
Message-ID: CA+TgmoZyns3FYX5CJVtZWXXn_nLkaV1NhUNG9DvhCVwKQN81jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 14, 2017 at 8:33 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Of course, that supposes that 0009 can manage to postpone creating
> non-sampled child joinrels until create_partition_join_plan(), which
> it currently doesn't. In fact, unless I'm missing something, 0009
> hasn't been even slightly adapted to take advantage of the
> infrastructure in 0001; it doesn't seem to reset the path_cxt or
> anything. That seems like a fairly major omission.

Some other comments on 0009:

Documentation changes for the new GUCs are missing.

+between the partition keys of the joining tables. The equi-join between
+partition keys implies that for a given row in a given partition of a given
+partitioned table, its joining row, if exists, should exist only in the
+matching partition of the other partitioned table; no row from non-matching

There could be more than one. I'd write: The equi-join between
partition keys implies that all join partners for a given row in one
partitioned table must be in the corresponding partition of the other
partitioned table.

+#include "miscadmin.h"
#include <limits.h>
#include <math.h>

Added in wrong place.

+ * System attributes do not need
translation. In such a case,
+ * the attribute numbers of the parent
and the child should
+ * start from the same minimum attribute.

I would delete the second sentence and add an Assert() to that effect instead.

+ /* Pass top parent's relids down the inheritance hierarchy. */

Why?

+ for (attno = rel->min_attr; attno <=
rel->max_attr; attno++)

Add add a comment explaining why we need to do this.

- add_paths_to_append_rel(root, rel, live_childrels);
+ add_paths_to_append_rel(root, rel, live_childrels, false);
}

-

No need to remove blank line.

+ * When called on partitioned join relation with partition_join_path = true, it
+ * adds PartitionJoinPath instead of Merge/Append path. This path is costed
+ * based on the costs of sampled child-join and is expanded later into
+ * Merge/Append plan.

I'm not a big fan of the Merge/Append terminology here. If somebody
adds another kind of append-path someday, then all of these comments
will have to be updated. I think this can be phrased more
generically.

/*
+ * While creating PartitionJoinPath, we sample paths from only
a few child
+ * relations. Even if all of sampled children have partial
paths, it's not
+ * guaranteed that all the unsampled children will have partial paths.
+ * Hence we do not create partial PartitionJoinPaths.
+ */

Very sad. I guess if we had parallel append available, we could maybe
dodge this problem, but for now I suppose we're stuck with it.

+ /*
+ * Partitioning scheme in join relation indicates a possibility that the
+ * join may be partitioned, but it's not necessary that every pair of
+ * joining relations can use partition-wise join technique. If one of
+ * joining relations turns out to be unpartitioned, this pair of joining
+ * relations can not use partition-wise join technique.
+ */
+ if (!rel1->part_scheme || !rel2->part_scheme)
+ return;

How can this happen? If rel->part_scheme != NULL, doesn't that imply
that every rel covered by the joinrel is partitioned that way, and
therefore this condition must necessarily hold?

In general, I think it's better style to write explicit tests against
NULL or NIL than to just write if (blahptr).

+ partitioned_join->sjinfo = copyObject(parent_sjinfo);

Why do we need to copy it?

+ /*
+ * Remove the relabel decoration. We can assume that there is
at most one
+ * RelabelType node; eval_const_expressions() simplifies multiple
+ * RelabelType nodes into one.
+ */
+ if (IsA(expr, RelabelType))
+ expr = (Expr *) ((RelabelType *) expr)->arg;

Still, instead of assuming this, you could just s/if/while/, and then
you wouldn't need the assumption any more. Also, consider castNode().

partition_wise_plan_weight may be useful for testing, but I don't
think it should be present in the final patch.

This is not a full review; I ran out of mental energy before I got to
the end. (Sorry.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-03-15 01:39:35 Re: GSOC Introduction / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Previous Message Robert Haas 2017-03-15 00:33:00 Re: Partition-wise join for join between (declaratively) partitioned tables