pgsql: Fix hash table size estimation error in choose_hashed_distinct()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix hash table size estimation error in choose_hashed_distinct()
Date: 2013-08-21 17:38:47
Message-ID: E1VCCMp-0004Ob-2M@gemulon.postgresql.org
Lists: pgsql-committers

Fix hash table size estimation error in choose_hashed_distinct().

We should account for the per-group hashtable entry overhead when
considering whether to use a hash aggregate to implement DISTINCT. The
comparable logic in choose_hashed_grouping() gets this right, but I think
I omitted it here in the mistaken belief that there would be no overhead
if there were no aggregate functions to be evaluated. This can result in
a more than 2X underestimate of the hash table size if the tuples being
aggregated aren't very wide. Per report from Tomas Vondra.
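
To illustrate why omitting the per-group overhead matters most for narrow
tuples, here is a minimal sketch of the arithmetic. It is not the code in
planner.c: the function names and all three constants (alignment, tuple
header size, per-entry overhead) are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All three constants are illustrative assumptions, not PostgreSQL's values. */
#define ALIGN_BYTES        8   /* assumed maximum alignment */
#define TUPLE_HEADER_BYTES 16  /* assumed stored-tuple header size */
#define ENTRY_OVERHEAD     56  /* assumed per-group hash entry bookkeeping */

/* Round len up to the next multiple of ALIGN_BYTES (a power of two). */
static size_t
align_up(size_t len)
{
    return (len + ALIGN_BYTES - 1) & ~((size_t) ALIGN_BYTES - 1);
}

/*
 * Estimated bytes per hash table entry for tuples of the given width.
 * include_overhead = false models the estimate that ignores per-group
 * bookkeeping; true models the estimate that accounts for it.
 */
static size_t
hash_entry_size(size_t tuple_width, bool include_overhead)
{
    size_t size = align_up(tuple_width) + align_up(TUPLE_HEADER_BYTES);

    if (include_overhead)
        size += ENTRY_OVERHEAD;
    return size;
}

/* True if the estimated table for num_groups fits in work_mem_bytes. */
static bool
hash_fits(size_t tuple_width, double num_groups,
          double work_mem_bytes, bool include_overhead)
{
    assert(num_groups >= 0);
    return (double) hash_entry_size(tuple_width, include_overhead) * num_groups
        <= work_mem_bytes;
}
```

Under these assumed constants, an 8-byte-wide tuple is estimated at 24 bytes
per entry without the overhead and 80 bytes with it. For one million groups
and a 32MB memory budget, hash_fits() returns true in the first case and
false in the second, so the two estimates lead to different decisions about
hashing.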

This bug is of long standing, but per discussion we'll only back-patch into
9.3. Changing the estimation behavior in stable branches seems to carry too
much risk of destabilizing plan choices for already-tuned applications.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/3454876314f0711894599f56e42ac99082b4e38f

Modified Files
--------------
src/backend/optimizer/plan/planner.c | 4 ++++
1 file changed, 4 insertions(+)
