Re: Join problem

From: Suller András <suan2(at)freemail(dot)hu>
To: Silke Trissl <trissl(at)informatik(dot)hu-berlin(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join problem
Date: 2004-04-28 08:15:19
Message-ID: 408F6817.1060708@freemail.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

these two queries are not equal. Query1 returns 6 rows, query2 returns 0
rows, because '~*' and '=' operators are not same. BTW when you use '=',
it could use index on "item.description".
On query1, "Seq Scan on item" estimates 1 row, on query2 it estimates
733 rows. IMHO that's why query1 uses nested loop, query2 uses hash join.

bye,
Suller Andras

Silke Trissl írta:

> Hi,
>
> Query 1:
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>
> Nested Loop (cost=0.00..28836.75 rows=1 width=4) (actual
> time=65350.780..452130.702 rows=6 loops=1)
> Join Filter: ("inner".order_id = "outer".order_id)
> -> Seq Scan on item (cost=0.00..28814.24 rows=1 width=8) (actual
> time=33.180..1365.190 rows=716 loops=1)
> Filter: (description ~* 'CD'::text)
> -> Seq Scan on orders (cost=0.00..22.50 rows=1 width=4) (actual
> time=21.644..629.500 rows=18 loops=716)
> Filter: ((order_name)::text ~* 'Smith'::text)
> Total runtime: 452130.782 ms
> ###########################################################################
>
>
> Query 2:
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------
>
> Hash Join (cost=22.50..28840.44 rows=4 width=4) (actual
> time=1187.798..1187.798 rows=0 loops=1)
> Hash Cond: ("outer".order_id = "inner".order_id)
> -> Seq Scan on item (cost=0.00..28814.24 rows=733 width=8)
> (actual time=542.737..542.737 rows=0 loops=1)
> Filter: (description = 'CD'::text)
> -> Hash (cost=22.50..22.50 rows=1 width=4) (actual
> time=645.042..645.042 rows=0 loops=1)
> -> Seq Scan on orders (cost=0.00..22.50 rows=1 width=4)
> (actual time=22.373..644.996 rows=18 loops=1)
> Filter: ((order_name)::text ~* 'Smith'::text)
> Total runtime: 1187.865 ms
> ############################################################################
>
>
>
> Query 1 with 'set enable_nestloop to false'
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------
>
> Hash Join (cost=22.50..28836.75 rows=1 width=4) (actual
> time=1068.593..2003.330 rows=6 loops=1)
> Hash Cond: ("outer".item_id = "inner".item_id)
> -> Seq Scan on item (cost=0.00..28814.24 rows=1 width=8) (actual
> time=33.347..1357.073 rows=716 loops=1)
> Filter: (description ~* 'CD'::text)
> -> Hash (cost=22.50..22.50 rows=1 width=4) (actual
> time=645.287..645.287 rows=0 loops=1)
> -> Seq Scan on orders (cost=0.00..22.50 rows=1 width=4)
> (actual time=22.212..645.239 rows=18 loops=1)
> Filter: ((order_name)::text ~* 'CD'::text)
> Total runtime: 2003.409 ms

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Manfred Koizar 2004-04-28 08:24:41 Re: Simply join in PostrgeSQL takes too long
Previous Message Edoardo Ceccarelli 2004-04-28 08:13:14 Re: [JDBC] [PERFORM] is a good practice to create an index on the