Skip site navigation (1) Skip section navigation (2)

Re: Solving hash table overrun problems

From: Aaron Birkland <birkie(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Solving hash table overrun problems
Date: 2005-03-04 06:05:37
Message-ID: 19ab0ccd050303220545ceb6e8@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
> We saw a case recently where a hash join was using much more memory than
> it was supposed to, causing failure when the server ran out of memory.

Yes.  I had the same problem a few month ago,
http://archives.postgresql.org/pgsql-general/2004-09/msg00410.php

It turned out that the cost estimates were so way off no matter what
tunables were modified, so I never was ever able to execute the query
fully.  I analyzed the code and devised a solution that was similar
what you proposed, though I didn't consider HashAggregates
at the time.  Unfortunately, I lost all work in a hard drive failure
and was never able to get back to working on it, so I can't really
refer to my old notes.  For what it's worth, your solution looks very
reasonable to me.

This also brings up a line of thought I had a while ago on a related
topic.  Something like a "HashDistinct" might be useful, if it had no
startup cost.  It would basically be a plan node in the executor that
would dynamically build a hashtable so that it can pull rows from its
child node (discarding if they appear in the hashtable) until it can
pass on a novel row.   I have some reservations about it, though.  At
best,
in queries with minimal startup cost from the get-go, it would seem to
be a tradeoff favoring latency over throughput (assuming the
HashDistinct would be a slower operation overall than separate
aggregation and distinct operations).   Then we have the issue of
really big hash tables...  I was hoping to get some time in the
upcoming months to hash out these issues to see if it's worth it, and
if it would be generally useful at all.

   -Aaron

In response to

Responses

pgsql-hackers by date

Next:From: Pailloncy Jean-GerardDate: 2005-03-04 08:38:29
Subject: Re: bitmap AM design
Previous:From: Vikram KalsiDate: 2005-03-04 06:01:59
Subject: postgreSQL-8.0.1 configure --enable-thread-safety with icc-8.1 on RHEL-AS3 Itanium-2 gives error

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group