Re: postmaster consuming /lots/ of memory with hash aggregate. why?

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: postmaster consuming /lots/ of memory with hash aggregate. why?
Date: 2010-11-12 15:33:21
Message-ID: AANLkTikv9BT3=5C1aXCJomej0gQxTkv3TX1DeiMjA-P-@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Nov 11, 2010 at 10:38 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/11/12 Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>:
>> On Thu, Nov 11, 2010 at 10:26 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> Hello
>>>
>>> look on EXPLAIN ANALYZE command. Probably your statistic are out, and
>>> then planner can be confused. EXPLAIN ANALYZE statement show it.
>>
>> As I noted earlier, I did set statistics to 1000 an re-ran vacuum
>> analyze and the plan did not change.
>
> this change can do nothing. this is default in config. did you use
> ALTER TABLE ALTER COLUMN SET STATISTIC = ... ? and ANALYZE

No. To be clear: are you saying that changing the value for
default_statistics_target, restarting postgresql, and re-running
VACUUM ANALYZE does *not* change the statistics for columns
created/populated *prior* to the sequence of operations, and that one
/must/ use ALTER TABLE ALTER COLUMN SET STATISTICS ... and re-ANALYZE?

That does not jive with the documentation, which appears to suggest
that setting a new default_statistics_target, restarting postgresql,
and then re-ANALYZE'ing a table should be sufficient (provided the
columns have not had a statistics target explicitly set).

>> What other diagnostics can I provide? This still doesn't answer the
>> 40000 row question, though. It seems absurd to me that the planner
>> would give up and just use 40000 rows (0.02 percent of the actual
>> result).
>>
>
> there can be some not well supported operation, then planner use a
> some % from rows without statistic based estimation

The strange thing is that the value 40000 keeps popping up in totally
diffferent contexts, with different tables, databases, etc... I tried
digging through the code and the only thing I found was that numGroups
was being set to 40000 but I couldn't see where.

--
Jon

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message bricklen 2010-11-12 15:34:36 Re: MVCC performance issue
Previous Message Thom Brown 2010-11-12 13:54:57 Re: MVCC performance issue