Re: Question with hashed IN

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Question with hashed IN
Date: 2003-08-17 15:54:03
Message-ID: 11768.1061135643@sss.pgh.pa.us
Lists: pgsql-hackers

Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> On Sun, 17 Aug 2003, Tom Lane wrote:
>> That doesn't make any sense to me --- AFAICS, only the planner pays any
>> attention to reltuples, so it could only affect things via changing the
>> plan. Could we see details?

> I've included a perl file that generates data like that I was using and
> the output of the commands from that through psql -E on my machine. The
> times seem pretty repeatable in any order so caching and such doesn't seem
> to be playing a big part.

Oh, I see what it is. The initial sizing of the hash table (number of
buckets) is done using the planner's estimate of the number of rows out
of the subplan. In your later examples, the hash table is woefully
overloaded and so searching it takes longer (too many items on each
hash chain).
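
To make the effect concrete, here is a minimal sketch of a chained hash table
whose bucket count is fixed up front from a row estimate. It is not the
executor's actual code: the names build_chained_table and
average_chain_length, the target_load parameter, and the row counts are all
assumptions chosen for illustration.

```python
# Illustrative sketch only; this is not PostgreSQL's executor code.
# All names and numbers here are hypothetical.

def build_chained_table(keys, estimated_rows, target_load=1.0):
    """Size the bucket array once from an estimate, then insert every key."""
    nbuckets = max(1, int(estimated_rows / target_load))
    buckets = [[] for _ in range(nbuckets)]
    for key in keys:
        # Colliding keys are appended to the same chain.
        buckets[hash(key) % nbuckets].append(key)
    return buckets


def average_chain_length(buckets):
    """Mean number of entries per non-empty bucket."""
    used = [b for b in buckets if b]
    return sum(len(b) for b in used) / len(used) if used else 0.0


keys = list(range(100_000))

# Estimate matches reality: about one entry per chain.
well_sized = build_chained_table(keys, estimated_rows=100_000)

# Estimate is 100x too low: every probe walks a long chain.
undersized = build_chained_table(keys, estimated_rows=1_000)

print(average_chain_length(well_sized))
print(average_chain_length(undersized))
```

With the estimate off by a factor of 100, each lookup has to scan roughly 100
times as many entries per chain, which is the kind of slowdown described above.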

I'm not sure how important this is to work on. We could try to make the
executor's hash code more able to adapt when the hash table grows beyond
what it was expecting (by rehashing, etc.), but personally I'd rather spend
the time on trying to improve the estimate to begin with.
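
For reference, the adaptive alternative could look roughly like the sketch
below: a chained table that doubles its bucket count and rehashes once the
load factor passes a threshold. This is only an assumption about what
"rehashing" might mean in practice, not a description of any existing
executor code; the class name GrowableChainedTable and the max_load value
are hypothetical.

```python
# Illustrative sketch only; not an existing PostgreSQL facility.
# Class name, max_load, and row counts are hypothetical.

class GrowableChainedTable:
    """Chained hash table that doubles its bucket count when overloaded."""

    def __init__(self, estimated_rows, max_load=2.0):
        self.nbuckets = max(1, estimated_rows)
        self.max_load = max_load
        self.buckets = [[] for _ in range(self.nbuckets)]
        self.count = 0

    def insert(self, key):
        self.buckets[hash(key) % self.nbuckets].append(key)
        self.count += 1
        # Grow once the average chain length exceeds the allowed load.
        if self.count > self.max_load * self.nbuckets:
            self._rehash(self.nbuckets * 2)

    def _rehash(self, new_nbuckets):
        """Redistribute every stored key over a larger bucket array."""
        old = self.buckets
        self.nbuckets = new_nbuckets
        self.buckets = [[] for _ in range(new_nbuckets)]
        for chain in old:
            for key in chain:
                self.buckets[hash(key) % new_nbuckets].append(key)

    def contains(self, key):
        return key in self.buckets[hash(key) % self.nbuckets]

    def longest_chain(self):
        return max(len(b) for b in self.buckets)


# Start from an estimate that is 100x too low, then load the real rows.
table = GrowableChainedTable(estimated_rows=1_000)
for k in range(100_000):
    table.insert(k)

print(table.nbuckets, table.longest_chain())
```

The cost is the extra pass over all entries at each doubling, which is the
work a better initial estimate would avoid.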

regards, tom lane
