Re: Speeding up aggregates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Speeding up aggregates
Date: 2002-12-09 22:06:50
Message-ID: 20088.1039471610@sss.pgh.pa.us
Lists: pgsql-performance

Joe Conway <mail(at)joeconway(dot)com> writes:
>> How many rows out if you drop the HAVING clause?

> parts=# set sort_mem to 8000;
> SET
> parts=# explain analyze select i.part_id, sum(w.qty_oh) as total_oh
>          from inv i, iwhs w where i.part_id = w.part_id group by i.part_id;
>                                   QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=5617.22..5706.04 rows=35528 width=36) (actual time=1525.93..1627.41 rows=34575 loops=1)
>    ->  Hash Join  (cost=1319.10..5254.45 rows=72553 width=36) (actual time=156.86..1248.73 rows=72548 loops=1)

>> The planner's choice of which to use is dependent on its estimate of the
>> required hashtable size, which is proportional to its guess about how
>> many distinct groups there will be. The above output doesn't tell us
>> that however, only how many groups passed the HAVING clause. I'm
>> curious about the quality of this estimate, since the code to try to
>> generate not-completely-bogus group count estimates is all new ...

> If I'm reading it correctly, it looks like the estimate in this case is pretty
> good.

Better than I had any right to expect ;-). Thanks.
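
For anyone wanting to sanity-check where that estimate comes from, here is a
minimal sketch (assuming the inv/iwhs tables from this thread): the planner's
group-count guess is driven by the n_distinct statistic that ANALYZE stores,
which can be compared against the true distinct count:

    SELECT tablename, attname, n_distinct
      FROM pg_stats
     WHERE tablename = 'iwhs' AND attname = 'part_id';

    SELECT count(DISTINCT part_id) FROM iwhs;

    -- If the two diverge badly, raising the per-column statistics target
    -- and re-running ANALYZE will usually tighten the estimate:
    -- ALTER TABLE iwhs ALTER COLUMN part_id SET STATISTICS 100;
    -- ANALYZE iwhs;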

regards, tom lane
