Re: [HACKERS] advanced partition matching algorithm for partition-wise join

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Date: 2018-02-26 10:03:21
Message-ID: CAFjFpResoxfp1rnV4Op9JOnG19VNEnjvjRN5DVd8QRHD+agTDw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 23, 2018 at 7:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Feb 16, 2018 at 12:14 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> Appreciate you taking time for review.
>>
>> PFA updated version.
>
> Committed 0001.

Thanks.

Here's the patchset rebased on the latest head. I have fixed all the
crashes and bugs reported so far.

One of the crashes was related to the targetlist of a child-join
relation. In basic partition-wise join, either all pairs of joining
relations can use the partition-wise join technique or none can. So
the joining pair used to create the targetlist for a child-join
corresponds to the pair of joining relations used to create the
targetlist for the parent join. With the restrictions related to
missing partitions discussed upthread, this isn't true with advanced
partition matching: the joining pair used to create the targetlist for
a child-join may not correspond to the pair used to create the
targetlist for the parent join. Since these two pairs are different,
build_joinrel_tlist() arranges the targetlist entries in a different
order for the child-join and the parent join. But for an appendrel, we
expect the parent and child targetlists to be in sync.

So the fix is: instead of constructing the child-join targetlist from
the joining relations, we construct it by translating the parent
join's targetlist. The basic partition-wise join commit had changed
build_joinrel_tlist() to handle child-joins, and it had changed
set_append_rel_size() to compute attr_needed for child relations so
that that information could be used to build the child-join's
targetlist. Neither of those changes is needed once we translate. I
have added this fix as a separate patch with those two changes
reverted.

When we lift the restrictions related to missing partitions (I am not
sure when), we will get back to a state where the joining pair that
creates the targetlist for a child-join corresponds to the joining
pair that creates the targetlist for the parent join. At that point we
would not need translation, which consumes more memory, and could use
the attr_needed information instead. So maybe we should retain those
two changes instead of reverting them. Any suggestions?
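
To make the ordering problem concrete, here is a toy sketch. It uses
plain Python lists, not the planner's data structures; the relation
names (a, b, c and children a1, b1, c1) and both helper functions are
made up for illustration and only mimic the "outer columns first, then
inner columns" ordering. It shows that building the child targetlist
from a different joining pair gives a different column order, while
translating the parent's targetlist keeps the two in sync.

```python
# Toy model only: a "targetlist" is a list of "rel.col" strings.
# Nothing here is PostgreSQL code; all names are hypothetical.

def build_join_tlist(outer, inner):
    """Order columns by joining pair: outer side first, then inner side."""
    return list(outer) + list(inner)


def translate_tlist(parent_tlist, parent_to_child):
    """Map each parent column to its child column, keeping parent order."""
    translated = []
    for col in parent_tlist:
        rel, name = col.split(".")
        translated.append(f"{parent_to_child[rel]}.{name}")
    return translated


# The parent join of a, b, c is first built from the pair (a JOIN b, c).
parent = build_join_tlist(build_join_tlist(["a.x"], ["b.y"]), ["c.z"])

# Assumed scenario: the child join is first built from a different pair,
# (b1 JOIN c1, a1), because the parent's pair is not usable for it.
child_from_pair = build_join_tlist(
    build_join_tlist(["b1.y"], ["c1.z"]), ["a1.x"]
)

# Translating the parent's list instead preserves the parent's order.
child_translated = translate_tlist(parent, {"a": "a1", "b": "b1", "c": "c1"})

print(parent)            # ['a.x', 'b.y', 'c.z']
print(child_from_pair)   # ['b1.y', 'c1.z', 'a1.x']
print(child_translated)  # ['a1.x', 'b1.y', 'c1.z']
```

The cost of the translated version in this sketch is the extra list
built per child, which is the toy analogue of the additional memory
that translation consumes.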

The extra, extensive testcase advance_partition_join_test still fails
because of plan changes caused by the recent changes to append
costing. I have verified that those plan changes are fine and that the
queries do not hit any bugs or crashes. But that testcase has a great
many queries and the plans are not stable. Since that testcase is not
meant for committing, I haven't fixed the plan diffs right now.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_adv_dp_join_patches_v6.tar.gz application/x-gzip 151.7 KB
