Quick Links

Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Scott Carey <scott(at)richrelevance(dot)com>
Cc:	"pgsql-performance(at)postgresql(dot)org Performance" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
Date:	2010-10-19 03:43:45
Message-ID:	1532.1287459825@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Scott Carey <scott(at)richrelevance(dot)com> writes:
> I consistently see HashJoin plans that hash the large table, and scan
> the small table.

Could we see a self-contained test case? And what cost parameters are
you using, especially work_mem?

> This is especially puzzling in some cases where I have 30M rows in the big table and ~ 100 in the small... shouldn't it hash the small table and scan the big one?

Well, size of the table isn't the only factor; in particular, a highly
nonuniform distribution of the key value will inflate the cost estimate
for using a table on the inner size of the hash. But the example you
show here seems a bit extreme.

regards, tom lane

In response to

HashJoin order, hash the large or small table? Postgres likes to hash the big one, why? at 2010-10-19 01:40:11 from Scott Carey

Responses

Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why? at 2010-10-19 09:00:11 from Scott Carey

Browse pgsql-performance by date

	From	Date	Subject
Next Message	AI Rumman	2010-10-19 07:18:45	Re: how to get the total number of records in report
Previous Message	Jon Nelson	2010-10-19 02:47:08	Re: unexpected query failure: ERROR: GIN indexes do not support whole-index scans