Re: TPC-H Q20 from 1 hour to 19 hours!

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TPC-H Q20 from 1 hour to 19 hours!
Date: 2017-03-31 11:53:43
Message-ID: CAOGQiiOkSxHzv0eAhTboKxLMa2=0Z28S83gScMcqkUjOt==rdQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 31, 2017 at 5:13 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, Mar 30, 2017 at 8:24 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Mar 29, 2017 at 8:00 PM, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> What is however strange is that changing max_parallel_workers_per_gather
>>> affects row estimates *above* the Gather node. That seems a bit, um,
>>> suspicious, no? See the parallel-estimates.log.
>>
>> Thanks for looking at this! Comparing the parallel plan vs. the
>> non-parallel plan:
>>
>> part: parallel rows (after Gather) 20202, non-parallel rows 20202
>> partsupp: parallel rows 18, non-parallel rows 18
>> part-partsupp join: parallel rows 88988, non-parallel rows 355951
>> lineitem: parallel rows 59986112, non-parallel rows 59986112
>> lineitem after aggregation: parallel rows 5998611, non-parallel rows 5998611
>> final join: parallel rows 131, non-parallel rows 524
>>
>> I agree with you that that looks mighty suspicious. Both the
>> part-partsupp join and the final join have exactly 4x as many
>> estimated rows in the non-parallel plan as in the parallel plan, and
>> it just so happens that the parallel divisor here will be 4.
>>
>> Hmm... it looks like the parallel_workers value from the Gather node
>> is being erroneously propagated up to the higher levels of the plan
>> tree. Wow. Somehow, Gather Merge managed to get the logic correct
>> here, but Gather is totally wrong. Argh. Attached is a draft patch,
>> which I haven't really tested beyond checking that it passes 'make
>> check'.
>>
>
> Your patch looks good to me. I have verified some join cases as well,
> where the behaviour is sane after the patch. I have also done testing
> with force_parallel_mode=regress (ran make check-world), and everything
> seems good.
>
> --
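
As a side note on the 4x: the parallel divisor Robert mentions comes
from the row-scaling logic in costsize.c, which looks roughly like this
(a sketch from memory, so treat the details as approximate):

    /* costsize.c (sketch): divisor applied to scale down per-worker rows */
    static double
    get_parallel_divisor(Path *path)
    {
        double  parallel_divisor = path->parallel_workers;
        double  leader_contribution;

        /* the leader helps too, but less and less as workers are added */
        leader_contribution = 1.0 - (0.3 * path->parallel_workers);
        if (leader_contribution > 0)
            parallel_divisor += leader_contribution;

        return parallel_divisor;
    }

With 4 planned workers the leader contribution (1.0 - 0.3 * 4) goes
negative and is dropped, so the divisor is exactly 4.0, which matches
the 4x difference in the join estimates quoted above.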

I tried checking the plan of Q20 with this patch and got the following
results.

With the patch:

    ->  Merge Join  (cost=3025719.98..3499235.22 rows=6 width=16)
        (actual time=176440.801..245903.143 rows=118124 loops=1)
          Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND
          (lineitem.l_suppkey = partsupp.ps_suppkey))
          Join Filter: ((partsupp.ps_availqty)::numeric > ((0.5 * sum(lineitem.l_quantity))))

and without the patch:

    ->  Merge Join  (cost=3014830.12..3511637.54 rows=2 width=16)
        (actual time=130994.237..208887.149 rows=118124 loops=1)
          Merge Cond: ((partsupp.ps_partkey = lineitem.l_partkey) AND
          (partsupp.ps_suppkey = lineitem.l_suppkey))
          Join Filter: ((partsupp.ps_availqty)::numeric > ((0.5 * sum(lineitem.l_quantity))))

So it looks like the patch does not help much in the problematic area:
the join is still estimated at a handful of rows (6 with the patch, 2
without) against an actual 118124. Please find attached the query plans
of Q20 with and without the patch. I haven't yet evaluated the
performance of this query with the patch.
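
For reference, the merge join above comes from Q20's correlated
subquery. From the TPC-H spec the query is roughly as below ('forest%',
the shipdate, and 'CANADA' are the spec's sample substitution
parameters; actual values vary per run):

    select s_name, s_address
    from supplier, nation
    where s_suppkey in
          (select ps_suppkey
           from partsupp
           where ps_partkey in
                 (select p_partkey from part where p_name like 'forest%')
             and ps_availqty > (select 0.5 * sum(l_quantity)
                                from lineitem
                                where l_partkey = ps_partkey
                                  and l_suppkey = ps_suppkey
                                  and l_shipdate >= date '1994-01-01'
                                  and l_shipdate < date '1994-01-01'
                                                   + interval '1' year))
      and s_nationkey = n_nationkey
      and n_name = 'CANADA'
    order by s_name;

The Join Filter shown in both plans is that ps_availqty comparison after
the planner pulls the aggregated lineitem subquery up into a join, and
its estimate is what stays badly off with or without the patch.
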
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

Attachment Content-Type Size
q20_after_gather_fix_patch.txt text/plain 10.0 KB
