Quick Links

Re: Really bad blowups with hash outer join and nulls

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	pgsql-hackers(at)postgresql(dot)org, gavin(at)shopwindow(dot)me
Subject:	Re: Really bad blowups with hash outer join and nulls
Date:	2015-02-15 15:33:25
Message-ID:	17667.1424014405@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> A quick test suggests that initializing the hash value to ~0 rather than
> 0 has a curious effect: the number of batches still explodes, but the
> performance does not suffer the same way. (I think because almost all
> the batches end up empty.) I think this is worth doing even in the
> absence of a more general solution; nulls are common enough and
> important enough that they shouldn't be the worst-case value if it can
> be avoided.

I think that's unlikely to be a path to a good solution.

At least part of the problem here is that estimate_hash_bucketsize()
supposes that nulls can be ignored --- which is normally true, and
invalidates your claim that they're common. But in a RIGHT JOIN
situation, they need to be considered as if they were regular keys.
That would probably be sufficient to dissuade the planner from choosing
a hash join in this example.

There may also be something we can do in the executor, but it would
take closer analysis to figure out what's going wrong. I don't think
kluging the behavior for NULL in particular is the answer.

regards, tom lane

In response to

Really bad blowups with hash outer join and nulls at 2015-02-15 09:54:21 from Andrew Gierth

Responses

Re: Really bad blowups with hash outer join and nulls at 2015-02-15 23:50:37 from Andrew Gierth

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Florian Pflug	2015-02-15 16:14:21	Re: restrict global access to be readonly
Previous Message	Petr Jelinek	2015-02-15 14:25:59	Re: Logical Replication Helpers WIP for discussion