Re: Partition-wise join for join between (declaratively) partitioned tables

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-07-19 04:24:12
Message-ID: CAOGQiiNwEZ3hepxCAYL5pxvHMTQWfY-8x1R3dAxS=HRUrVG0vA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, Jul 14, 2017 at 12:32 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>
>
> Here's revised patch set with only 0004 revised. That patch deals with
> creating multi-level inheritance hierarchy from multi-level partition
> hierarchy. The original logic of recursively calling
> inheritance_planner()'s guts over the inheritance hierarchy required
> that for every such recursion we flatten many lists created by that
> code. Recursion also meant that root->append_rel_list is traversed as
> many times as the number of partitioned partitions in the hierarchy.
> Instead the revised version keep the iterative shape of
> inheritance_planner() intact, thus naturally creating flat lists,
> iterates over root->append_rel_list only once and is still easy to
> read and maintain.
>
On testing this patch for TPC-H (for scale factor 20) benchmark I found a
regression for Q21, on head it was taking some 600 seconds and with this
patch it is taking 3200 seconds. This comparison is on the same partitioned
database, one using the partition wise join patch and other is without it.
The execution time of Q21 on unpartitioned head is some 300 seconds. The
explain analyse output for each of these cases is attached.

This suggests that partitioning is not a suitable strategy for this query,
but then may be partition wise should not be picked for such a case to
aggravate the performance issue.

The details of the setup is as follows,

Server parameter settings,
work_mem - 1GB
effective_cache_size - 8GB
shared_buffers - 8GB
enable_partition_wise_join = on

Partition information:
Type of partitioning - single column range partition
Tables partitioned - Lineitem and orders

Lineitem -
Partition key = l_orderkey
No of partitions = 18

Orders -
Partition key = o_orderkey
No of partitions = 11

Commit id - 42171e2cd23c8307bbe0ec64e901f58e297db1c3

I chose orderkey as the partition key since it is the primary key of orders
and along with l_linenumber it forms the primary key for lineitem.
For the above mentioned settings, there was no other query that used
partitioned wise join.

Please let me know if any more information is required regarding this
experimentation.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment Content-Type Size
part_reg.zip application/zip 8.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2017-07-19 05:04:39 Re: [PATCH] Make sure all statistics is sent after a few DML are performed
Previous Message Mark Dilger 2017-07-19 04:13:10 Re: Something for the TODO list: deprecating abstime and friends