From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, gavin(at)shopwindow(dot)me
Subject: Re: Really bad blowups with hash outer join and nulls
Date: 2015-02-15 23:50:37
Message-ID: 87egpqlqnh.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> A quick test suggests that initializing the hash value to ~0 rather
>> than 0 has a curious effect: the number of batches still explodes,
>> but the performance does not suffer the same way. (I think because
>> almost all the batches end up empty.) I think this is worth doing
>> even in the absence of a more general solution; nulls are common
>> enough and important enough that they shouldn't be the worst-case
>> value if it can be avoided.

Tom> I think that's unlikely to be a path to a good solution.

It wasn't really intended to be.

Tom> At least part of the problem here is that
Tom> estimate_hash_bucketsize() supposes that nulls can be ignored ---
Tom> which is normally true, and invalidates your claim that they're
Tom> common. But in a RIGHT JOIN situation, they need to be considered
Tom> as if they were regular keys. That would probably be sufficient
Tom> to dissuade the planner from choosing a hash join in this example.

I've now tried the attached patch to correct the bucketsize estimates,
and it does indeed stop the planner from considering the offending path
(in this case it just does the join the other way round).
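
For illustration, here's a standalone sketch of the general shape of
the adjustment; the names and arithmetic below are simplified
stand-ins, not lifted from estimate_hash_bucketsize() or from the
attached patch:

#include <stdio.h>

/*
 * Hypothetical sketch, not the attached patch: estimate the fraction
 * of inner rows falling into the busiest hash bucket, optionally
 * counting nulls as if they were a regular key value.
 */
static double
bucketsize_frac(double ndistinct,   /* distinct non-null key values */
                double nullfrac,    /* fraction of rows with null keys */
                double mcvfreq,     /* frequency of most common value */
                double nbuckets,    /* number of hash buckets */
                int nulls_hashed)   /* nulls kept, e.g. RIGHT/FULL join */
{
    double estfract;

    /*
     * Null keys all collide in a single bucket, so when they must be
     * loaded into the hash table, treat NULL as one more "value" that
     * competes with the most common real value.
     */
    if (nulls_hashed && nullfrac > mcvfreq)
        mcvfreq = nullfrac;

    /* the busiest bucket holds at least an even share of the keys ... */
    estfract = (ndistinct > nbuckets) ? 1.0 / nbuckets : 1.0 / ndistinct;

    /* ... and no less than the most common (or null) value's share */
    if (estfract < mcvfreq)
        estfract = mcvfreq;

    return estfract;
}

int
main(void)
{
    /* 1000 distinct keys, 90% null, 1024 buckets */
    printf("nulls ignored: %.4f\n",
           bucketsize_frac(1000, 0.9, 0.001, 1024, 0));
    printf("nulls counted: %.4f\n",
           bucketsize_frac(1000, 0.9, 0.001, 1024, 1));
    return 0;
}

With, say, 90% nulls on the hashed side, the estimated fraction of
rows in the busiest bucket jumps from 0.001 to 0.9, which is the sort
of penalty that pushes the planner off the hash path.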

One thing I couldn't immediately see how to do was account for the case
where there are a lot of nulls in the table but a strict qual (or an IS
NOT NULL) filters them out; this patch will be overly pessimistic about
such cases. Do estimates normally try to take things like this into
account? I didn't find any other relevant examples.

Tom> There may also be something we can do in the executor, but it
Tom> would take closer analysis to figure out what's going wrong. I
Tom> don't think kluging the behavior for NULL in particular is the
Tom> answer.

The point with nulls is that a hash value of 0 is currently special in
two distinct ways: it's always in batch 0 and bucket 0 regardless of how
many batches and buckets there are, and it's the result of hashing a
null. These two special cases interact in a worst-case manner, so it
seems worthwhile to avoid that.
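
To make that concrete, here's a standalone toy modelled on
ExecHashGetBucketAndBatch() in nodeHash.c (simplified, not the literal
source). Buckets come from the low bits of the hash and batches from
the bits just above them, so hash value 0 is pinned to batch 0/bucket 0
no matter how often nbatch doubles, while ~0 always maps to the
highest-numbered batch:

#include <stdio.h>
#include <stdint.h>

/* simplified version of the hash -> (bucket, batch) mapping */
static void
bucket_and_batch(uint32_t hashvalue, uint32_t nbuckets, int log2_nbuckets,
                 uint32_t nbatch, uint32_t *bucketno, uint32_t *batchno)
{
    *bucketno = hashvalue & (nbuckets - 1);                 /* low bits */
    *batchno = (hashvalue >> log2_nbuckets) & (nbatch - 1); /* next bits */
}

int
main(void)
{
    uint32_t bucket;
    uint32_t batch;
    uint32_t nbatch;

    /* 1024 buckets (log2 = 10); watch what happens as nbatch doubles */
    for (nbatch = 1; nbatch <= 64; nbatch <<= 1)
    {
        bucket_and_batch(0, 1024, 10, nbatch, &bucket, &batch);
        printf("hash 0x00000000, nbatch %2u -> batch %2u, bucket %4u\n",
               (unsigned) nbatch, (unsigned) batch, (unsigned) bucket);
        bucket_and_batch(~(uint32_t) 0, 1024, 10, nbatch, &bucket, &batch);
        printf("hash 0xffffffff, nbatch %2u -> batch %2u, bucket %4u\n",
               (unsigned) nbatch, (unsigned) batch, (unsigned) bucket);
    }
    return 0;
}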

--
Andrew (irc:RhodiumToad)

Attachment: hjfix2.patch (text/x-patch, 5.8 KB)
