Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-17 10:28:18
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On Thu, Mar 16, 2017 at 8:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Mar 16, 2017 at 6:48 AM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> I thought the whole point here was that NOT doing that caused the
>>> memory usage for partitionwise join to get out of control. Am I
>>> confused?
>> We took a few steps to reduce the memory footprint of partition-wise
>> join in [1] and [2]. According to the numbers reported in [1] and then
>> in [2], if the total memory consumed by a planner is 44MB (memory
>> consumed by paths 150K) for a 5-way non-parition-wise join between
>> tables with 1000 partitions, partition-wise join consumed 192MB which
>> is 4.4 times the non-partitino-wise case. The earlier implementation
>> of blowing away a memory context after each top-level child-join, just
>> got rid of the paths created for that child-join. The total memory
>> consumed by paths created for all the child-joins was about 150MB.
>> Remember that we can not get rid of memory consumed by expressions,
>> RelOptInfos, RestrictInfos etc. since their pointers will be copied
>> into the plan nodes.
> All right, I propose that we revise our plan for attacking this
> problem. The code in this patch that proposes to reduce memory
> utilization is very complicated and it's likely to cause us to miss
> this release altogether if we keep hacking on it. So, I propose that
> you refactor this patch series so that the first big patch is
> partition-wise join without any of the optimizations that save memory
> - essentially the sample_partition_fraction = 1 case with all
> memory-saving optimizations removed. If it's only there to save
> memory, rip it out. Also, change the default value of
> enable_partition_wise_join to false and document that turning it on
> may cause a large increase in planner memory utilization, and that's
> why it's not enabled by default.
> If we get that committed, then we can have follow-on patches that add
> the incremental path creation stuff and other memory-saving features,
> and then at the end we can flip the default from "off" to "on".
> Probably that last part will slip beyond v10 since we're only two
> weeks from the end of the release cycle, but I think that's still
> better than having everything slip. Let's also put the multi-level
> partition-wise join stuff ahead of the memory-saving stuff, because
> being able to do only a single-level of partition-wise join is a
> fairly unimpressive feature; I'm not sure this is really even
> committable without that.
> I realize in some sense that I'm telling you to go and undo all of the
> work that you just did based on what I told you before, but I think
> we've actually made some pretty good progress here: it's now clear
> that there are viable strategies for getting the memory usage down to
> an acceptable level, and we've got draft patches for those strategies.
> So committing the core feature without immediately including that work
> can't be regarded as breaking everything hopelessly; rather, it now
> looks (I think, anyway) like a reasonable intermediate step towards
> the eventual goal.

Here's the set of patches with all the memory saving stuff removed.
It's now bare partition-wise joins. I have tried to eliminate all
memory saving stuff carefully, except few bms_free() and list_free()
which fit the functions they were part of and mostly were present in
my earlier versions of patches. But I might have missed some. Also, I
have corrected any indentation/white space mistakes introduced by
editing patches with +/-, but I might have missed some. Please let me

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size application/zip 65.7 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2017-03-17 10:37:16 Re: UPDATE of partition key
Previous Message Nikolay Shaplov 2017-03-17 10:22:25 Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM