Re: Ambigous Plan - Larger Table on Hash Side

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Narendra Pradeep U U <narendra(dot)pradeep(at)zohocorp(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Ambigous Plan - Larger Table on Hash Side
Date:	2018-03-12 17:43:14
Message-ID:	20180312174314.fehtgox5qr4lfqp6@alap3.anarazel.de
Lists:	pgsql-hackers

On 2018-03-12 12:52:00 -0400, Tom Lane wrote:
> Narendra Pradeep U U <narendra(dot)pradeep(at)zohocorp(dot)com> writes:
> > Recently I came across a case where the planner chose the larger table for the hash side. I am not sure whether this is intended behavior or whether we are missing something.
>
> Probably the reason is that the smaller table has a less uniform
> distribution of the hash key. You don't want to hash with a nonuniform
> distribution of the hashtable key; if many keys go into the same bucket
> then performance degrades drastically.

Not sure I follow. Unless the values are equivalent (i.e. duplicate key
values), why should non-uniformity in key space translate to
non-uniformity in hash space? And if there are duplicates, it shouldn't
hurt much either, unless we're doing a semi/anti-join? All rows are going
to be returned anyway, and IIRC we continue a bucket scan quite cheaply?
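
To make the distinction concrete, here is a rough sketch (not PostgreSQL's
actual hash functions or bucket layout). It assumes a well-mixing hash,
with BLAKE2b standing in purely for illustration, and the helper names
bucket_of and longest_chain are made up for this example. It compares the
longest bucket chain for keys that are clustered but distinct against keys
where half the rows share a single value:

```python
import hashlib
from collections import Counter


def bucket_of(key: int, nbuckets: int) -> int:
    """Map an integer key to a bucket via a well-mixing hash.

    BLAKE2b is an illustrative stand-in, not what PostgreSQL uses.
    """
    raw = key.to_bytes(8, "little", signed=True)
    digest = hashlib.blake2b(raw, digest_size=8).digest()
    return int.from_bytes(digest, "little") % nbuckets


def longest_chain(keys, nbuckets: int) -> int:
    """Return the number of rows in the fullest bucket."""
    counts = Counter(bucket_of(k, nbuckets) for k in keys)
    return max(counts.values())


NBUCKETS = 1024

# Non-uniform in key space (squares thin out as they grow), but all distinct.
clustered_distinct = [i * i for i in range(4096)]

# Half of the rows carry the same key value; the rest are distinct.
duplicated = [7] * 2048 + list(range(2048))

if __name__ == "__main__":
    print("distinct but clustered keys:", longest_chain(clustered_distinct, NBUCKETS))
    print("heavily duplicated keys:    ", longest_chain(duplicated, NBUCKETS))
```

Under these assumptions the distinct keys spread roughly evenly over the
buckets however they cluster in key space, while every duplicate
necessarily lands in the same bucket. Whether that long chain is actually
expensive to scan is the separate question raised above.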

Greetings,

Andres Freund

In response to

Re: Ambigous Plan - Larger Table on Hash Side at 2018-03-12 16:52:00 from Tom Lane

Responses

Re: Ambigous Plan - Larger Table on Hash Side at 2018-03-12 18:06:51 from Tom Lane
