Re: bad plan: 8.4.8, hashagg, work_mem=1MB.

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: bad plan: 8.4.8, hashagg, work_mem=1MB.
Date: 2011-06-20 19:31:05
Message-ID: BANLkTike3P_0RSd--dZBcAZ+MKcy3ieDqQ@mail.gmail.com
Lists: pgsql-performance

On Mon, Jun 20, 2011 at 11:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
>> I ran a query recently where the result was very large. The outer-most
>> part of the query looked like this:
>
>>  HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
>>    ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)
>
>> The row count for 'Result' is in the right ballpark, but why does
>> HashAggregate think that it can turn 2 *billion* rows of strings (an
>> average of 30 bytes long) into only 200?
>
> 200 is the default assumption about number of groups when it's unable to
> make any statistics-based estimate.  You haven't shown us any details so
> it's hard to say more than that.

What sorts of details would you like? The row count for the Result
line is approximately correct -- the stats for all tables are up to
date (the tables never change after import). The statistics target is
currently set to 100.

--
Jon
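
For readers following the thread, the fallback Tom describes can be sketched as a toy model. This is only an illustration under stated assumptions, not the planner's actual code: `estimate_groups` is a hypothetical helper, the clamp to the input row count is a simplification, and the sign convention for `n_distinct` follows the one documented for the `pg_stats` view (positive means an absolute count, negative means a fraction of the row count). The constant name mirrors `DEFAULT_NUM_DISTINCT` in the PostgreSQL source.

```python
# Toy model of a group-count estimate; NOT the real planner logic.
DEFAULT_NUM_DISTINCT = 200  # fallback used when no statistics-based estimate exists


def estimate_groups(input_rows, n_distinct=None):
    """Return a rough number of groups for an aggregate over input_rows rows.

    n_distinct follows the pg_stats convention:
      > 0     estimated absolute number of distinct values
      < 0     negated fraction of rows that are distinct (e.g. -0.5 = half)
      None/0  no usable statistics (hypothetical stand-in for "unknown")
    """
    if n_distinct is None or n_distinct == 0:
        groups = DEFAULT_NUM_DISTINCT
    elif n_distinct < 0:
        groups = -n_distinct * input_rows
    else:
        groups = n_distinct
    # Simplifying assumption: never estimate more groups than input rows.
    return int(min(groups, input_rows))


# With no statistics, the 2.4 billion input rows from the plan collapse to 200.
print(estimate_groups(2417500797))         # 200
# With a usable statistic, the estimate scales with the input instead.
print(estimate_groups(2417500797, -0.5))
```

In this model the rows=200 on the HashAggregate node says nothing about the data itself; it only signals that no per-column statistic could be applied to the grouping expression.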
