Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-14 12:04:27
Message-ID: CAFjFpRfgc__y_gUhvv2C2nmL-WyDyTEOYaedM5wDUuCDRYiY4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review.

>
> Some very high-level thoughts based on a look through these patches:
>
> In 0001, you've removed a comment about how GEQO needs special
> handling, but it doesn't look as if you've made any compensating
> change elsewhere. That seems unlikely to be correct. If GEQO needs
> some paths to survive longer than others, how can it be right for this
> code to create them all in the same context?

Thanks for pointing that out. I have replaced the code and the
comments back. There was another issue that the temporary paths
created by geqo will not be freed when geqo moves one genetic string
to next genetic string (or what it calls as tour: an list of relation
to be joined in a given order). To fix this, we need to set the path
context to the temporary context of geqo inside geqo_eval() before
calling gimme_tree() and reset it later. That way the temporary paths
are also created in the temporary memory context of geqo. Fixed in the
patch.

> Incidentally,
> geqo_eval() seems to be an existing precedent for the idea of throwing
> away paths and RelOptInfos, so we might want to use similar code for
> partitionwise join.

There are some differences in what geqo does and what partition-wise
needs to do. geqo tries many joining orders each one in a separate
temporary context. The way geqo slices the work, every slice produces
a full plan. For partition-wise join I do not see a way to slice the
work such that the whole path and corresponding RelOptInfos come from
the same slice. So, we can't use the same method as GEQO.

It's worth noticing that paths are created twice for the cheapest
joining order that it finds: once in the trial phase and second time
when the final plan is created. The second time, the paths,
RelOptInfos, expressions used by the final plan are in the context as
the plan.

>
> 0002 and 0003 look OK.
>
> Probably 0004 is OK too, although that seems to be adding some
> overhead to existing callers for the benefit of new ones. Might be
> insignificant, though.

Yes, the overhead is to add and extract the appinfo from a list when
there is only one appinfo. We may optimize it by passing appinfo
directly when there's only one and pass list when there are more, but
that is complicating the code unnecessarily. The overhead seems to be
worth the cost to keep the code simpler.

>
> 0005 looks OK, except that add_join_rel's definition is missing a
> "static" qualifier. That's not just cosmetic; based on previous
> expereince, this will break the BF.

Thanks for pointing it out. Done.

>
> 0006 seems to be unnecessary; the new function isn't used in later patches.

It's required by 0011 - reparameterize_path_by_child(). BTW, I need to
know whether reparameterize_path_by_child() looks good, so that I can
complete it by adding support for all kinds of path in that function.

>
> Haven't looked at 0007 yet.
>
> 0008 is, as previously mentioned, more than we probably want to commit.

I agree, and I will work on that.

>
> Haven't looked at 0009 yet.
>
> 0010 - 0012 seem to be various fixes which would need to be done
> before or along with 0009, rather than afterward, so I am confused
> about the ordering of those patches in the patch series.

They are needed only when we have 0009. But when those are clubbed
with 0009, it makes 0009 review difficult as the code for those fixes
mixes with the code for partition-wise support. So, I have separated
those out into patches categorized by functionality. Reviewer may then
apply 0009 and see what failures each of the changes in 0010-0012
fixes, if required. They need to be committed along-with 0009.

>
> The commit message for 0013 is a bit unclear about what it's doing,
> although I can guess, a bit, based on the commit message for 0007.
>

This is preparatory patch for 0015 which supports partition-wise join
for multi-level partitioned tables. We have discussed about
partition-wise join support for multi-level partitioned tables in [1].
We may decide to postpone patches 0013-0015 to v11, if this gets too
much for v10.

[1] https://www.postgresql.org/message-id/CAFjFpRceMmx26653XFAYvc5KVQcrzcKScVFqZdbXV%3DkB8Akkqg@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-03-14 12:04:38 Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message Ashutosh Sharma 2017-03-14 12:02:37 Re: Microvacuum support for Hash Index