Re: Partition-wise join for join between (declaratively) partitioned tables

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-14 02:57:04
Message-ID: ad18d62d-8e7b-5411-b164-a2b2580ae1a4@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/09/14 7:43, Robert Haas wrote:
> On Wed, Sep 13, 2017 at 12:56 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> I debugged what happens in case of query "select 1 from t1 union all
>> select 2 from t1;" with the current HEAD (without multi-level
>> expansion patch attached). It doesn't set partitioned_rels in Append
>> path that gets converted into Append plan. Remember t1 is a
>> multi-level partitioned table here with t1p1 as its immediate
>> partition and t1p1p1 as partition of t1p1. So, the
>> set_append_rel_pathlist() recurses once as shown in the following
>> stack trace.
>
> Nice debugging.

+1.

> I spent some time today looking at this and I think
> it's a bug in v10, and specifically in add_paths_to_append_rel(),
> which only sets partitioned_rels correctly when the appendrel is a
> partitioned rel, and not when it's a subquery RTE with one or more
> partitioned queries beneath it.
>
> Attached are two patches either one of which will fix it. First, I
> wrote mechanical-partrels-fix.patch, which just mechanically
> propagates partitioned_rels lists from accumulated subpaths into the
> list used to construct the parent (Merge)AppendPath. I wasn't entire
> happy with that, because it ends up building multiple partitioned_rels
> lists for the same RelOptInfo. That seems silly, but there's no
> principled way to avoid it; avoiding it amounts to hoping that all the
> paths for the same relation carry the same partitioned_rels list,
> which is uncomfortable.
>
> So then I wrote pcinfo-for-subquery.patch. That patch notices when an
> RTE_SUBQUERY appendrel is processed and accumulates the
> partitioned_rels of its immediate children; in case there can be
> multiple nested levels of subqueries before we get down to the actual
> partitioned rel, it also adds a PartitionedChildRelInfo for the
> subquery RTE, so that there's no need to walk the whole tree to build
> the partitioned_rels list at higher levels, just the immediate
> children. I find this fix a lot more satisfying. It adds less code
> and does no extra work in the common case.

I very much like pcinfo-for-subquery.patch, although I'm not sure if we
need to create PartitionedChildRelInfo for the sub-query parent RTE as the
patch teaches add_paths_to_append_rel() to do. ISTM, nested UNION ALL
subqueries are flattened way before we get to add_paths_to_append_rel();
if it could not be flattened, there wouldn't be a call to
add_paths_to_append_rel() in the first place, because no AppendRelInfos
would be generated. See what happens when is_simple_union_all_recurse()
returns false to flatten_simple_union_all() -- no AppendRelInfos will be
generated and added to root->append_rel_list in that case.

IOW, there won't be nested AppendRelInfos for nested UNION ALL sub-queries
like we're setting out to build for multi-level partitioned tables.

So, as things stand today, there can at most be one recursive call of
add_path_to_append_rel() for a sub-query parent RTE, that is, if its child
sub-queries contain partitioned tables, but not more. The other patch
(multi-level expansion of partitioned tables) will change that, but even
then we won't need sub-query's own PartitioendChildRelInfo.

> Notice that the choice of fix we adopt has consequences for your
> 0001-Multi-level-partitioned-table-expansion.patch -- with
> mechanical-partrels-fix.patch, that patch could either associated all
> partitioned_rels with the top-parent or it could work level by level
> and everything would get properly assembled later. But with
> pcinfo-for-subquery.patch, we need everything associated with the
> top-parent. That doesn't seem like a problem to me, but it's
> something to note.

I think it's fine.

With 0001-Multi-level-partitioned-table-expansion.patch,
get_partitioned_child_rels() will get called even for non-root partitioned
tables, for which it won't find a valid pcinfo. I think that patch must
also change its callers to stop Asserting that a valid pcinfo is returned.

Spotted a typo in pcinfo-for-subquery.patch:

+ * A plain relation will alread have

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2017-09-14 03:13:50 Re: PG 10 release notes
Previous Message Noah Misch 2017-09-14 02:49:54 Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b