Re: JOIN vs. LEFT JOIN

From: "Nico Callewaert" <callewaert(dot)nico(at)telenet(dot)be>
To: "Andreas Wenk" <a(dot)wenk(at)netzmeister-st-pauli(dot)de>
Cc: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: JOIN vs. LEFT JOIN
Date: 2009-01-28 11:59:35
Message-ID: 369802F769184EE18547C348838B7A57@etsinformatics.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

>>> Nico Callewaert schrieb:
>>>> Hi !
>>>>
>>>> I heard that it is always better to use a full JOIN rather than a LEFT
>>>> JOIN, for performance reasons. Is that correct ?
>>>
>>> please note that a JOIN and a LEFT JOIN are tow different things.
>>> Let's assume you have
>>> two tables (without any sens):
>>>
>>> books:
>>>
>>> headline | content
>>> - ----------+-------------
>>> politics | russia
>>> politics | middle east
>>> politics | germany
>>>
>>> interests:
>>>
>>> content
>>> - ---------
>>> germany
>>>
>>> Then fire two select statements like this:
>>>
>>> #1:
>>> SELECT a.headline,a.content,b.content as contentb
>>> FROM books a
>>> JOIN interests b ON a.content = b.content;
>>>
>>> headline | content | contentb
>>> - ----------+---------+----------
>>> politics | germany | germany
>>> (1 row)
>>>
>>> #2:
>>> SELECT a.headline,a.content,b.content as contentb
>>> FROM books a
>>> LEFT JOIN interests b on a.content = b.content;
>>> headline | content | contentb
>>> - ----------+-------------+----------
>>> politics | russia |
>>> politics | middle east |
>>> politics | germany | germany
>>> (3 rows)
>>>
>>>> But it's barely possible to use full JOINS all the time, since most of
>>>> the lookup fields are not required.
>>>> Example : in the customer table, the customer type is a looup field to
>>>> a
>>>> customer_type table. But that is not required to post the record. So
>>>> I
>>>> was thinking to create a record in the customer_type table with ID =
>>>> -1. And in case the customer type is NULL, to assign it the value -1.
>>>> That way, it will be always possible to do a full JOIN. I was
>>>> wondering
>>>> if that is good practice or rather nonsense ?
>>>
>>> Hm concerning the functionality of LEFT JOIN I do not really
>>> understand what you wanna do
>>> here. You created the customer_type table to have the possibility to
>>> give more types to
>>> one customer (1:n). But therefore you do not need a LEFT JOIN. The
>>> statement could be like:
>>>
>>> SELECT name a
>>> FROM customer a, cutomer_type b
>>> WHERE a.id = b.id
>>> AND b.type = 'super customer'
>>>
>>> Or not?
>>>
>>> Cheers
>>>
>>> Andy
>>
>>
>> Hi Andreas,
>>
>> Thanks for the reply. I know the difference between JOIN and LEFT JOIN.
>> The question was just if there is a performance hit between the 2 ?
>>
>> I never use constructs like this : SELECT name a
>>> FROM customer a, cutomer_type b
>>> WHERE a.id = b.id
>>> AND b.type = 'super customer'
>>
>> Always Joins.
>> I will try to clarify. The current situation is : the customer_type has
>> no value, so = NULL, no problem with LEFT JOIN.
>
> Why do you need an empty entry in this table? Your application should take
> care, that this
> does not happen ...
>
>> But I'm wondering with tables that has thousands of records, if LEFT
>> JOINS are performing well ?
>
> EXPLAIN ANALYZE with my examples:
>
> #1
> test=# explain analyze select a.headline,a.content,b.content as contentb
> from books a join
> interests b on a.content = b.content;
> QUERY PLAN
>
> -
> ------------------------------------------------------------------------------------------------------------------
> Hash Join (cost=10.88..23.67 rows=70 width=1548) (actual
> time=0.051..0.058 rows=1 loops=1)
> Hash Cond: (("outer".content)::text = ("inner".content)::text)
> -> Seq Scan on interests b (cost=0.00..11.40 rows=140 width=516)
> (actual
> time=0.007..0.010 rows=1 loops=1)
> -> Hash (cost=10.70..10.70 rows=70 width=1032) (actual
> time=0.026..0.026 rows=3 loops=1)
> -> Seq Scan on books a (cost=0.00..10.70 rows=70 width=1032)
> (actual
> time=0.004..0.013 rows=3 loops=1)
> Total runtime: 0.094 ms
> (6 rows)
>
>
> #2
> test=# explain analyze select a.headline,a.content,b.content as contentb
> from books a
> left join interests b on a.content = b.content;
> QUERY PLAN
>
> -
> ----------------------------------------------------------------------------------------------------------------------
> Hash Left Join (cost=11.75..23.72 rows=70 width=1548) (actual
> time=0.038..0.062 rows=3
> loops=1)
> Hash Cond: (("outer".content)::text = ("inner".content)::text)
> -> Seq Scan on books a (cost=0.00..10.70 rows=70 width=1032) (actual
> time=0.007..0.014 rows=3 loops=1)
> -> Hash (cost=11.40..11.40 rows=140 width=516) (actual
> time=0.015..0.015 rows=1 loops=1)
> -> Seq Scan on interests b (cost=0.00..11.40 rows=140 width=516)
> (actual
> time=0.004..0.007 rows=1 loops=1)
> Total runtime: 0.102 ms
> (6 rows)
>
> I fired each statement ten times. But seriously - this is more or less
> just a hint that a
> LEFT JOIN is less performant tahn a JOIN. To get that exactly it depend on
> which execution
> plan the planner is using. So the best way ist to use EXPALIN ANALYZE with
> your statements.
>
> Sorry for the big post with much output ...
>
> Cheers
>
> Andy
>

Hi Andy,

Thanks again for the fast reply !

> Why do you need an empty entry in this table? Your application should take
> care, that this
> does not happen ...

That is the whole point of my doubts. When a user creates a new customer,
it's not an obligation to enter a customer_type, and still many other
fields, like tax, payment, yellow pages, payment method, etc... related
things. So I always have lots of LEFT JOIN's in my queries, because of the
NULL fields. You said the application should handle it. So you mean, even
the field is not obligatory, you would put a value on it ? All lookup
tables are of course linked through foreign keys to enforce data integrity.
So what value would you enter ? Like 0, or -1 ?? And make sure there is a
0, or -1 record in the lookup table ? Otherwise the foreign key will
complain.
I think I almost get your point, just need a last little push :-)

Thanks a lot !
Nico

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Andreas Wenk 2009-01-28 12:12:54 Re: JOIN vs. LEFT JOIN
Previous Message Andreas Wenk 2009-01-28 11:46:57 Re: JOIN vs. LEFT JOIN