Re: hashagg, statistics and excessive memory allocation

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Subject: Re: hashagg, statistics and excessive memory allocation
Date: 2006-05-11 19:04:20
Message-ID: 200605111204.21771.josh@agliodbs.com
Lists: pgsql-hackers

Stefan,

> Some testing showed that the planner was seriously underestimating the
> number of distinct rows in the table (with the default statistics target
> it estimated ~150k rows while there are about 19M distinct values) and
> choosing a hashagg for the aggregate.
> Upping the statistics target to 1000 improves the estimate to about 5M
> rows, which unfortunately is still not enough to cause the planner to
> switch to a groupagg with work_mem set to 256000.
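[Editor's note: the underestimate Stefan describes is inherent to sample-based
n_distinct estimation. A toy Python simulation (not PostgreSQL code; sizes are
scaled down from the 19M-distinct case, and the formula is the Haas-Stokes
"Duj1" estimator that, per the comments in analyze.c, ANALYZE uses):]

```python
import random
from collections import Counter

# Toy model of ANALYZE's row sampling: N table rows, D true distinct
# values, and a sample of roughly 300 * statistics_target rows.
random.seed(0)
N = 1_000_000        # rows in the table (scaled down)
D = 500_000          # true number of distinct values
n = 30_000           # sampled rows (~statistics target 100)

table = [random.randrange(D) for _ in range(N)]
sample = random.sample(table, n)

counts = Counter(sample)
d = len(counts)                                  # distinct values seen in sample
f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

# Duj1 (Haas-Stokes) estimator: n*d / (n - f1 + f1*n/N)
duj1 = n * d / (n - f1 + f1 * n / N)
print(f"true D={D}, raw sample d={d}, Duj1 estimate={duj1:.0f}")
```

[When nearly every sampled value is seen at most once, the sample carries very
little signal, and even the extrapolated estimate lands well below the true D --
exactly the regime a high-cardinality column with a modest statistics target is in.]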

Well, it's pretty well-known that we need to fix n-distinct estimation.
But we also need to handle it gracefully if the estimate is still wrong
and we start using too much memory. Is there any way we can check how
much memory the hashagg actually *is* using and spill to disk if it goes
over work_mem?
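[Editor's note: one well-known answer to Josh's question is hybrid/Grace-style
partitioned spilling: keep aggregating the groups that already fit in memory,
and write rows for overflow groups to per-partition temp files that are
aggregated in later passes. A rough, purely illustrative Python sketch --
hash_agg_count and its parameters are invented here, not PostgreSQL code:]

```python
import tempfile
from collections import defaultdict

def hash_agg_count(rows, max_groups, n_partitions=4):
    """GROUP BY key, COUNT(*) with bounded memory: once the in-memory
    hash table holds max_groups distinct keys, rows for unseen keys are
    spilled to per-partition temp files and aggregated in later passes."""
    result = {}
    pending = [(list(rows), 0)]          # (batch of keys, recursion level)
    while pending:
        batch, level = pending.pop()
        table = defaultdict(int)
        files = [None] * n_partitions
        for key in batch:
            if key in table or len(table) < max_groups:
                table[key] += 1          # group fits in memory: aggregate now
            else:
                # Hash varies with level, so an oversized partition gets
                # re-split differently on the next pass.
                p = hash((key, level)) % n_partitions
                if files[p] is None:
                    files[p] = tempfile.TemporaryFile(mode="w+")
                files[p].write(key + "\n")
        result.update(table)             # spilled keys never enter `table`
        for f in files:
            if f is not None:
                f.seek(0)
                pending.append(([line.rstrip("\n") for line in f], level + 1))
                f.close()
    return dict(result)
```

[Every occurrence of a spilled key lands in the same partition file, so each
group is counted exactly once, and memory is bounded by max_groups per pass.]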

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
