From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Speeding up aggregates
Date: 2002-12-08 19:37:47
Message-ID: 3DF39F8B.4050801@joeconway.com
Lists: pgsql-performance
Tom Lane wrote:
> Note that even though there's no SORT, the sort_mem setting is used
> to determine the allowable hashtable size, so a too-small sort_mem
> might discourage the planner from selecting hashed aggregation.
> Use EXPLAIN to see which query plan gets chosen.
>
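(For anyone reproducing this: plain EXPLAIN prints the chosen plan without
actually running the query, so it's a cheap way to check whether you get
HashAggregate or GroupAggregate, e.g.

parts=# explain select i.part_id, sum(w.qty_oh) as total_oh from inv i, iwhs w where i.part_id = w.part_id group by i.part_id having sum(w.qty_oh) > 0;

EXPLAIN ANALYZE, used below, also executes the query and adds actual row
counts and times.)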
Just to follow up on my last post: I did indeed find that bumping up sort_mem
switched the plan back to HashAggregate, with a big improvement:
parts=# show sort_mem ;
 sort_mem
----------
 8192
(1 row)

parts=# set sort_mem to 32000;
SET
parts=# show sort_mem ;
 sort_mem
----------
 32000
(1 row)
parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh from inv i, iwhs w where i.part_id = w.part_id group by i.part_id having sum(w.qty_oh) > 0;
                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=5254.46..5432.10 rows=35528 width=36) (actual time=1286.89..1399.36 rows=4189 loops=1)
   Filter: (sum(qty_oh) > 0::double precision)
   ->  Hash Join  (cost=1319.10..4710.31 rows=72553 width=36) (actual time=163.36..947.54 rows=72548 loops=1)
         Hash Cond: ("outer".part_id = "inner".part_id)
         ->  Seq Scan on iwhs w  (cost=0.00..2121.53 rows=72553 width=22) (actual time=0.01..266.20 rows=72553 loops=1)
         ->  Hash  (cost=1230.28..1230.28 rows=35528 width=14) (actual time=162.70..162.70 rows=0 loops=1)
               ->  Seq Scan on inv i  (cost=0.00..1230.28 rows=35528 width=14) (actual time=0.04..88.98 rows=35528 loops=1)
 Total runtime: 1443.93 msec
(8 rows)
parts=# set sort_mem to 8192;
SET
parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh from inv i, iwhs w where i.part_id = w.part_id group by i.part_id having sum(w.qty_oh) > 0;
                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=11111.93..12015.10 rows=35528 width=36) (actual time=2836.98..3261.66 rows=4189 loops=1)
   Filter: (sum(qty_oh) > 0::double precision)
   ->  Sort  (cost=11111.93..11293.31 rows=72553 width=36) (actual time=2836.73..2937.78 rows=72548 loops=1)
         Sort Key: i.part_id
         ->  Hash Join  (cost=1319.10..5254.45 rows=72553 width=36) (actual time=155.42..1258.40 rows=72548 loops=1)
               Hash Cond: ("outer".part_id = "inner".part_id)
               ->  Seq Scan on iwhs w  (cost=0.00..2121.53 rows=72553 width=22) (actual time=0.01..308.57 rows=72553 loops=1)
               ->  Hash  (cost=1230.28..1230.28 rows=35528 width=14) (actual time=155.19..155.19 rows=0 loops=1)
                     ->  Seq Scan on inv i  (cost=0.00..1230.28 rows=35528 width=14) (actual time=0.02..86.82 rows=35528 loops=1)
 Total runtime: 3281.75 msec
(10 rows)
So when it gets used, HashAggregate provides better than a factor-of-two
improvement (1444 msec vs. 3282 msec), at least on this test case. Nice work, Tom!
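One other note if you want to keep the setting: SET only changes sort_mem
for the current session. A minimal sketch of making it stick server-wide
(32000 was just the value I tried here; sort_mem is measured in kilobytes,
so tune it to your workload and available RAM) is to edit postgresql.conf
and reload the postmaster, after which new sessions pick up the default:

    # postgresql.conf
    sort_mem = 32000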
Joe