Re: Partition-wise join for join between (declaratively) partitioned tables

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-22 05:15:47
Message-ID: CAOGQiiP7dfdG4JgCtnJMz-ww0a15NitdjF6qjxV7SWmno6DMpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 19, 2017 at 2:58 PM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> On Fri, Sep 15, 2017 at 2:09 PM, Rafia Sabih
> <rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
>>>
>> On TPC-H benchmarking of this patch, I found a regression in Q7. It
>> was taking some 1500s with the patch and some 900s without the patch.
>> Please find the attached pwd_reg.zip for the output of explain analyse
>> on head and with patch.
>>
>> The experimental settings used were,
>> commit-id = 0c504a80cf2e6f66df2cdea563e879bf4abd1629
>> patch-version = v26
>>
>> Server settings:
>> work_mem = 1GB
>> shared_buffers = 10GB
>> effective_cache_size = 10GB
>> max_parallel_workers_per_gather = 4
>>
>> Partitioning information:
>> Partitioning scheme = by range
>> Number of partitions in lineitem and orders table = 106
>> partition key for lineitem = l_orderkey
>> partition key for orders = o_orderkey
>>
>> Apart from these there is a regression case on a custom table, on head
>> query completes in 20s and with this patch it takes 27s. Please find
>> the attached .out and .sql file for the output and schema for the test
>> case respectively. I have reported this case before (sometime around
>> March this year) as well, but I am not sure if it was overlooked or is
>> an unimportant and expected behaviour for some reason.
>>
>
> On completing the benchmark for all queries for the above mentioned
> setup, following performance improvement can be seen,
> Query | Patch | Head
> 3 | 1455 | 1631
> 4 | 499 | 4344
> 5 | 1464 | 1606
> 10 | 1475 | 1599
> 12 | 1465 | 1790
>
> Note that all values of execution time are in seconds.

I compared this experiment with non-partitioned database and following
is the result,
Query | Non-partitioned head
3 | 1752
4 | 315
5 | 2319
10 | 1535
12 | 1739

In summary, the query that appears slowest in partitioned database is
not so otherwise. It is good to see that in Q4 partition-wise join
helps in achieving performance closer to it's non-partitioned case,
otherwise partitioning alone causes it to suffer greatly. Apart from
Q4 it does not looks like partitioning hurts anywhere else, though the
maximum improvement is ~35% for Q5.
Another point to note here is that the performance on partitioned and
unpartitioned heads are quite close (except Q4) which is something
atleast I wasn't expecting. It looks like we need not to partition the
tables anyway, or atleast this set of queries doesn't benefit from
partitioning. Please let me know if somebody has better ideas on how
partitioning schemes should be applied to make it more beneficial for
these queries.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment Content-Type Size
pg_unpart.zip application/zip 27.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-09-22 05:52:53 Re: analyzeCTE is too strict about typmods?
Previous Message Schneider 2017-09-22 04:52:12 Re: Generate wait event list and docs from text file