Re: BUG #2930: Hash join abyssmal with many null fields.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Maciej Babinski <maciej(at)apathy(dot)killer-robot(dot)net>
Cc: Maciej Babinski <maciej+postgres(at)apathy(dot)killer-robot(dot)net>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #2930: Hash join abyssmal with many null fields.
Date: 2007-01-26 17:22:26
Message-ID: 25727.1169832146@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Maciej Babinski <maciej(at)apathy(dot)killer-robot(dot)net> writes:
> Tom Lane wrote:
>> I see no bug here. AFAICT your "much faster" query gets that way by
>> having eliminated all the candidate join rows on the B side.

> The additional clause eliminates no rows beyond what the existing
> clause would. Any row eliminated by "b.join_id IS NOT NULL" could not
> possibly have satisfied "a.join_id = b.join_id".

Hmm. You assume that the = operator is strict, which is probably true,
but the hash join code isn't assuming that.

It might be worth checking for the case, though. What's happening,
since we go ahead and put the null rows into the hash table, is that
they all end up in the same hash chain because they all get hash code 0.
And then that very long chain gets searched for each null outer row.
If we knew the join operator is strict we could discard null rows
immediately on both sides.

> Please note that if the join columns are not null, but still produce
> no matches for the join, the results are fast without the need for an
> extra clause in the join:

Yeah, because the rows get reasonably well distributed into different
hash buckets. The optimizer will avoid a hash if it sees the data is
not well-distributed, but IIRC it's not considering nulls when it
makes that decision.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Maciej Babinski 2007-01-26 17:47:45 Re: BUG #2930: Hash join abyssmal with many null fields.
Previous Message Maciej Babinski 2007-01-26 16:15:25 Re: BUG #2930: Hash join abyssmal with many null fields.