Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-08-07 13:05:15
Message-ID: CAFjFpRcncUKUgGgpOM40rTa4CHqruQvLRcxhXqU_TNxh7_L3jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 3, 2017 at 9:38 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Thu, Aug 3, 2017 at 2:10 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Mon, Jul 31, 2017 at 7:59 AM, Ashutosh Bapat
>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> Adding AppendRelInfos to root->append_rel_list as and when they are
>>> created would keep parent AppendRelInfos before those of children. But
>>> that function throws away the AppendRelInfo it created when their are
>>> no real children i.e. in partitioned table's case when has no leaf
>>> partitions. So, we can't do that. Hence, I chose to change the API to
>>> return the list of AppendRelInfos when the given RTE has real
>>> children.
>>
>> So, IIUC, the case you're concerned about is when you have a hierarchy
>> of only partitioned tables, with no plain tables. For example, if B
>> is a partitioned table and a partition of A, and that's all there is,
>> A will recurse to B and B will return NIL.
>>
>> Is it necessary to get rid of the extra AppendRelInfos, or are they
>> harmless like the duplicate RTE and PlanRowMark nodes?
>>
>
> Actually there are two sides to this:
>
> If there are no leaf partitions, without the patch two things happen
> 1. rte->inh is cleared and 2 no appinfo is added to the
> root->append_rel_list, even though harmless RTE and PlanRowMark nodes
> are created. The first avoids treating the relation as the inheritance
> parent and thus avoids creating any child relations and paths, saving
> a lot of work. Ultimately set_rel_size() marks such a relation as
> dummy
> 352 else if (rte->relkind == RELKIND_PARTITIONED_TABLE)
> 353 {
> 354 /*
> 355 * A partitioned table without leaf
> partitions is marked
> 356 * as a dummy rel.
> 357 */
> 358 set_dummy_rel_pathlist(rel);
> 359 }
>
> Since root->append_rel_list is traversed for every inheritance parent,
> not adding needless AppendRelInfos improves performance and saves
> memory, (FWIW or consider a case where there are thousands of
> partitioned partitions without any leaf partition.).

With some testing, I found that this was true once, but not after
declarative partition support. Please check [1].

>
> My initial thought was to keep both these properties intact. But then
> removing such AppendRelInfos would have a problem when such a table is
> on the inner side of the join as described in [1]. So I wrote the
> patch not to do either of those things when there are partitioned
> partitions without leaf partitions. So, it looks like you are correct,
> we could just go ahead and add those AppendRelInfos directly to
> root->append_rel_list.

Irrespective of [1], I have implemented your idea of not changing
signature of expand_inherited_rtentry() with following idea.

>>
>> If we do need to get rid of the extra AppendRelInfos, maybe a less
>> invasive solution would be to change the if (!need_append) case to do
>> root->append_rel_list = list_truncate(root->append_rel_list,
>> original_append_rel_length).

[1] https://www.postgresql.org/message-id/CAFjFpReWJr1yTkHU=OqiMBmcYCMoSW3VPR39RBuQ_ovwDFBT5Q@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-08-07 13:12:22 Re: Pluggable storage
Previous Message Ashutosh Sharma 2017-08-07 12:37:01 free space % calculation in pgstathashindex