Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-29 03:09:11
Message-ID: CA+TgmobAyODvH2UJk-Sqfk+26L2jXmxuyKVM8j0REuRmcQMTXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 28, 2017 at 12:54 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Regarding 0002, I think the parts that involve factoring out
> find_param_path_info() are uncontroversial. Regarding the changes to
> adjust_appendrel_attrs(), my main question is whether we wouldn't be
> better off using an array representation rather than a List
> representation. In other words, this function could take PlannerInfo
> *root, Node *node, int nappinfos, AppendRelInfo **appinfos. Existing
> callers doing adjust_appendrel_attrs(root, whatever, appinfo) could
> just do adjust_appendrel_attrs(root, whatever, 1, &appinfo), not
> needing to allocate. To make this work, adjust_child_relids() and
> find_appinfos_by_relids() would need to be adjusted to use a similar
> argument-passing convention. I suspect this makes iterating over the
> AppendRelInfos mildly faster, too, apart from the memory savings.

Still regarding 0002, looking at adjust_appendrel_attrs_multilevel,
could we have a common code path for the baserel and joinrel cases?
It seems like perhaps you could just loop over root->append_rel_list.
For each appinfo, if (bms_is_member(appinfo->child_relid,
child_rel->relids)) bms_add_member(parent_relids,
appinfo->parent_relid).

This implementation would have some loss of efficiency in the
single-rel case because we'd scan all of the AppendRelInfos in the
list even if there's only one relid. But you could fix that by
writing it like this:

foreach (lc, root->append_rel_list)
{
if (bms_is_member(appinfo->child_relid, child_rel->relids))
{
bms_add_member(parent_relids, appinfo->parent_relid);
if (child_rel->reloptkind == RELOPT_OTHER_MEMBER_REL)
break; /* only one relid to find, and we've found it */
}
}
Assert(bms_num_members(child_rel->relids) == bms_num_members(parent_relids));

That seems pretty slick. It is just as fast as the current
implementation for the single-rel case. It allocates no memory
(unlike what you've got now). And it handles the joinrel case using
essentially the same code as the simple rel case.

In 0003, it seems that it would be more consistent with what you did
elsewhere if the last argument to allow_star_schema_join were named
inner_paramrels rather than innerparams. Other than that, I don't see
anything to complain about.

In 0004:

+ Assert(!rel->part_rels[cnt_parts]);
+ rel->part_rels[cnt_parts] = childrel;

break here?

+static void
+get_relation_partition_info(PlannerInfo *root, RelOptInfo *rel,
+ Relation
relation, bool inhparent)
+{
+ /* No partitioning information for an unpartitioned relation. */
+ if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
+ !inhparent ||

I still think the inhparent check should be moved to the caller.

In 0005:

+ * Returns a list of the RT indexes of the partitioned
child relations
+ * with any of joining relations' rti as the root parent RT index.

I found this wording confusing. Maybe: Build and return a list
containing the RTI of every partitioned relation which is a child of
some rel included in the join.

+ * Note: Only call this function on joins between partitioned tables.

Or what, the boogeyman will come and get you?

(In other words, I don't think that's a very informative comment.)

I don't think 0011 is likely to be acceptable in current form. I
can't imagine that we just went to the trouble of getting rid of
AppendRelInfos for child partitioned rels only to turn around and put
them back again. If you just need the parent-child mappings, you can
get that from the PartitionedChildRelInfo list.

Unfortunately, I don't think we're likely to be able to get this whole
patch series into a committable form in the next few days, but I'd
like to keep reviewing it and working with you on it; there's always
next cycle.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-29 03:16:58 Re: Schedule and Release Management Team for PG10
Previous Message Tomas Vondra 2017-03-29 03:08:11 Re: crashes due to setting max_parallel_workers=0