From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Hartranft, Robert M(dot) (GSFC-423(dot)0)[RAYTHEON CO]" <robert(dot)m(dot)hartranft(at)nasa(dot)gov> |
Cc: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: question on hash joins |
Date: | 2017-10-18 21:17:06 |
Message-ID: | 22675.1508361426@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
"Hartranft, Robert M. (GSFC-423.0)[RAYTHEON CO]" <robert(dot)m(dot)hartranft(at)nasa(dot)gov> writes:
> explain select count(1) from table1 g join table2 x on x.granuleid = g.granuleid where g.collectionid = 22467;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=18200480.80..18200480.81 rows=1 width=8)
> -> Hash Join (cost=103206.82..18190602.43 rows=3951347 width=0)
> Hash Cond: (x.granuleid = g.granuleid)
> -> Seq Scan on table2 x (cost=0.00..10596253.01 rows=644241901 width=8)
> -> Hash (cost=92363.72..92363.72 rows=660888 width=8)
> -> Index Only Scan using idx_table1 on table1 g (cost=0.57..92363.72 rows=660888 width=8)
> Index Cond: (collectionid = '22467'::bigint)
> (7 rows)
> My question is, what gets put into the Hash?
> I assume the with "width=8" must refer to the size of the key.
It's the data payload width, but there's some per-row overhead too, which
is more than 8 bytes ... don't recall how much more at the moment.
> The reason I ask is because, when I try to run the query it fails due to
> temp file use over 10GB.
What have you got work_mem set to? If the hashed table exceeds that,
you're going to end up with a multi-batch join, in which case most of
the outer table is going to get written into temp files as well.
I don't think it's a good idea to have such a small temp_file_limit
when you're working with such huge tables. Either a merge or hash
join is likely to require temp file space similar to the table size,
and you do *not* want a nestloop join, at least not if you want your
answers in reasonable time.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Hartranft, Robert M. (GSFC-423.0)[RAYTHEON CO] | 2017-10-19 13:09:42 | Re: question on hash joins |
Previous Message | Hartranft, Robert M. (GSFC-423.0)[RAYTHEON CO] | 2017-10-18 20:00:14 | question on hash joins |