Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-15 11:59:21
Message-ID: CAFjFpRcT4DRCc1tON1pDEfixi0pp=5Sp2zcuypBFHwOcDcgvRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 15, 2017 at 2:09 PM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> On TPC-H benchmarking of this patch, I found a regression in Q7. It
> was taking some 1500s with the patch and some 900s without the patch.
> Please find the attached pwd_reg.zip for the output of explain analyse
> on head and with patch.
>
> The experimental settings used were,
> commit-id = 0c504a80cf2e6f66df2cdea563e879bf4abd1629
> patch-version = v26
>
> Server settings:
> work_mem = 1GB
> shared_buffers = 10GB
> effective_cache_size = 10GB
> max_parallel_workers_per_gather = 4
>
> Partitioning information:
> Partitioning scheme = by range
> Number of partitions in lineitem and orders table = 106
> partition key for lineitem = l_orderkey
> partition key for orders = o_orderkey

I observe that with partition-wise join patch the planner is using
GatherMerge along-with partition-wise join and on head its not using
GatherMerge. Just to make sure that its partition-wise join which is
causing regression and not GatherMerge, can you please run the query
with enable_gathermerge = false?

I see following lines explain analyze output 7_1.out without the patch
-> Sort (cost=84634030.40..84638520.55 rows=1796063
width=72) (actual time=1061001.435..1061106.608 rows=437209 loops=1)
Sort Key: n1.n_name, n2.n_name,
(date_part('year'::text, (lineitem_001.l_shipdate)::timestamp without
time zone))
Sort Method: quicksort Memory: 308912kB
-> Hash Join (cost=16080591.94..84447451.72
rows=1796063 width=72) (actual time=252745.701..1057447.219
rows=1749956 loops=1)
Since Sort doesn't filter any rows, we would expect it to output the
same number of rows as hash join underneath it. But the number of rows
differ in this case. I am wondering whether there's some problem with
the explain analyze output itself.

>
> Apart from these there is a regression case on a custom table, on head
> query completes in 20s and with this patch it takes 27s. Please find
> the attached .out and .sql file for the output and schema for the test
> case respectively. I have reported this case before (sometime around
> March this year) as well, but I am not sure if it was overlooked or is
> an unimportant and expected behaviour for some reason.
>

Are you talking about [1]? I have explained about the regression in
[2] and [3]. This looks like an issue with the existing costing model.

[1] https://www.postgresql.org/message-id/CAOGQiiMwcjNrunJ_fCDBscrTLeJ-CLp7exfzzipe2ut71n4LUA@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAFjFpRedUZPa7tKbCLEGK3u5UWdDNQoN=eYfb7ieG5d0D1PbsQ@mail.gmail.com
[3] https://www.postgresql.org/message-id/CAFjFpReJKSdCfaeuZjGD79hOETzpz5BKDxLJgxr7qznrXX+TRw@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kuntal Ghosh 2017-09-15 12:13:35 Re: More efficient truncation of pg_stat_activity query strings
Previous Message Daniel Gustafsson 2017-09-15 11:37:37 Re: VACUUM and ANALYZE disagreeing on what reltuples means