Re: small table left outer join big table

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Jie Li <jay23jack(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: small table left outer join big table
Date: 2010-12-28 15:09:51
Message-ID: AANLkTi=_==09etYitq0Aiysa2BPScWZcq4WSq=mwm-3t@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 28, 2010 at 5:13 AM, Jie Li <jay23jack(at)gmail(dot)com> wrote:

> Hi,
>
> Please see the following plan:
>
> postgres=# explain select * from small_table left outer join big_table
> using (id);
> QUERY PLAN
>
>
> ----------------------------------------------------------------------------
> Hash Left Join (cost=126408.00..142436.98 rows=371 width=12)
> Hash Cond: (small_table.id = big_table.id)
> -> Seq Scan on small_table (cost=0.00..1.09 rows=9 width=8)
> -> Hash (cost=59142.00..59142.00 rows=4100000 width=8)
> -> Seq Scan on big_table (cost=0.00..59142.00 rows=4100000
> width=8)
> (5 rows)
>
> Here I have a puzzle, why not choose the small table to build hash table?
> It can avoid multiple batches thus save significant I/O cost, isn't it?
>
> We can perform this query in two phases:
> 1) inner join, using the small table to build hash table.
> 2) check whether each tuple in the hash table has matches before, which can
> be done with another flag bit
>
> The only compromise is the output order, due to the two separate phases.
> Not sure whether the SQL standard requires it.
>
>
SQL standard does not require the result to be in any particular order
unless an ORDER BY is used.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Colson 2010-12-28 15:10:13 page compression
Previous Message Magnus Hagander 2010-12-28 15:04:56 Re: pg_primary_conninfo