BUG #15926: TPCH Q18 slow: explain plan with 2 step Hash Aggregate (partial and finalize)

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: sylvie(dot)empereur-mot(at)bull(dot)net
Subject: BUG #15926: TPCH Q18 slow: explain plan with 2 step Hash Aggregate (partial and finalize)
Date: 2019-07-26 13:16:50
Message-ID: 15926-33afe9ab9466bcfa@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15926
Logged by: Sylvie Empereur
Email address: sylvie(dot)empereur-mot(at)bull(dot)net
PostgreSQL version: 12beta2
Operating system: Linux Red Hat Enterprise 7.6
Description:

Dear all,

While running a TPCH(at)100GB benchmark with PostgreSQL 12beta2, I observe a long
runtime for query Q18 (558 sec) on NVMe storage + POWER9 CPUs.

The TPCH(at)100GB Q18 explain plan shows Hash Aggregate in 2 steps (partial and
finalize HashAggregate).

Back in 2017, this type of performance degradation was mentioned for
TPCH(at)10GB with PostgreSQL 11:
https://www.postgresql-archive.org/Performance-degradation-in-TPC-H-Q18-td5945660.html

The 2017 discussion suggested reducing work_mem to avoid the
2-step Hash Aggregate.
Could you please provide your best recommendations for version 12?
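For reference, the workaround from the 2017 thread amounts to lowering work_mem for the session before running Q18. A minimal sketch (the 64MB value here is only illustrative, not a recommendation from that thread; the right value depends on the workload):

```sql
-- Lower work_mem for this session only, hoping the planner then avoids
-- the partial/finalize HashAggregate path (value is illustrative).
SET work_mem = '64MB';

-- ... run TPCH Q18 here ...

-- Restore the server default afterwards.
RESET work_mem;
```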

The good news is that all 22 TPCH(at)100GB queries execute in less than
45 minutes.

Thank you,
Sylvie

postgres=# SHOW WORK_MEM;
4GB

Limit (cost=36577180.95..36577181.20 rows=100 width=71)
  ->  Sort (cost=36577180.95..36578542.97 rows=544808 width=71)
        Sort Key: orders.o_totalprice DESC, orders.o_orderdate
        ->  GroupAggregate (cost=36544100.60..36556358.78 rows=544808 width=71)
              Group Key: customer.c_custkey, orders.o_orderkey
              ->  Sort (cost=36544100.60..36545462.62 rows=544808 width=44)
                    Sort Key: customer.c_custkey, orders.o_orderkey
                    ->  Hash Join (cost=16987987.80..36492192.96 rows=544808 width=44)
                          Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                          ->  Seq Scan on lineitem (cost=0.00..17248747.24 rows=600002624 width=9)
                          ->  Hash (cost=16986285.48..16986285.48 rows=136185 width=43)
                                ->  Hash Join (cost=15738629.49..16986285.48 rows=136185 width=43)
                                      Hash Cond: (orders.o_custkey = customer.c_custkey)
                                      ->  Nested Loop (cost=15042922.49..16290221.00 rows=136185 width=24)
                                            ->  Finalize GroupAggregate (cost=15042921.92..15150514.46 rows=136185 width=4)
                                                  Group Key: lineitem_1.l_orderkey
                                                  Filter: (sum(lineitem_1.l_quantity) > '313'::numeric)
                                                  ->  Gather Merge (cost=15042921.92..15138257.84 rows=817108 width=36)
                                                        Workers Planned: 2
                                                        ->  Sort (cost=15041921.90..15042943.28 rows=408554 width=36)
                                                              Sort Key: lineitem_1.l_orderkey
                                                              ->  Partial HashAggregate (cost=14998737.40..15003844.32 rows=408554 width=36)
                                                                    Group Key: lineitem_1.l_orderkey
                                                                    ->  Parallel Seq Scan on lineitem lineitem_1 (cost=0.00..13748731.93 rows=250001093 width=9)
                                            ->  Index Scan using orders_pkey on orders (cost=0.57..8.36 rows=1 width=20)
                                                  Index Cond: (o_orderkey = lineitem_1.l_orderkey)
                                      ->  Hash (cost=508243.11..508243.11 rows=14997111 width=23)
                                            ->  Seq Scan on customer (cost=0.00..508243.11 rows=14997111 width=23)
(28 rows)
