Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-13 22:43:33
Message-ID: CA+TgmoaKd7-HM7e6d3T7gKeP4E2shoZo__-wimp=9fiXBdD3NQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 13, 2017 at 12:56 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> I debugged what happens in case of query "select 1 from t1 union all
> select 2 from t1;" with the current HEAD (without multi-level
> expansion patch attached). It doesn't set partitioned_rels in Append
> path that gets converted into Append plan. Remember t1 is a
> multi-level partitioned table here with t1p1 as its immediate
> partition and t1p1p1 as partition of t1p1. So, the
> set_append_rel_pathlist() recurses once as shown in the following
> stack trace.

Nice debugging. I spent some time today looking at this and I think
it's a bug in v10, and specifically in add_paths_to_append_rel(),
which only sets partitioned_rels correctly when the appendrel is a
partitioned rel, and not when it's a subquery RTE with one or more
partitioned queries beneath it.

Attached are two patches either one of which will fix it. First, I
wrote mechanical-partrels-fix.patch, which just mechanically
propagates partitioned_rels lists from accumulated subpaths into the
list used to construct the parent (Merge)AppendPath. I wasn't entire
happy with that, because it ends up building multiple partitioned_rels
lists for the same RelOptInfo. That seems silly, but there's no
principled way to avoid it; avoiding it amounts to hoping that all the
paths for the same relation carry the same partitioned_rels list,
which is uncomfortable.

So then I wrote pcinfo-for-subquery.patch. That patch notices when an
RTE_SUBQUERY appendrel is processed and accumulates the
partitioned_rels of its immediate children; in case there can be
multiple nested levels of subqueries before we get down to the actual
partitioned rel, it also adds a PartitionedChildRelInfo for the
subquery RTE, so that there's no need to walk the whole tree to build
the partitioned_rels list at higher levels, just the immediate
children. I find this fix a lot more satisfying. It adds less code
and does no extra work in the common case.

Notice that the choice of fix we adopt has consequences for your
0001-Multi-level-partitioned-table-expansion.patch -- with
mechanical-partrels-fix.patch, that patch could either associated all
partitioned_rels with the top-parent or it could work level by level
and everything would get properly assembled later. But with
pcinfo-for-subquery.patch, we need everything associated with the
top-parent. That doesn't seem like a problem to me, but it's
something to note.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
mechanical-partrels-fix.patch application/octet-stream 7.7 KB
pcinfo-for-subquery.patch application/octet-stream 3.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Henry 2017-09-13 22:46:29 Re: [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?
Previous Message Tom Lane 2017-09-13 21:53:44 Re: uninterruptible state in 10beta4