Skip site navigation (1) Skip section navigation (2)

Which is better, correlated subqueries or joins?

From: Jeffrey Tenny <jeffrey(dot)tenny(at)comcast(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Which is better, correlated subqueries or joins?
Date: 2005-05-19 21:50:14
Message-ID: 428D0A16.2010709@comcast.net (view raw or flat)
Thread:
Lists: pgsql-performance
Looking for some general advice on correlated subqueries vs. joins.

Which of these plans is likely to perform better.  One table is a master 
record table for entities and their IDs (nv_products), the other 
represents a transitive closure of parent/child relationships (for a 
tree) of ID's in the master record table (and so is larger) 
(ssv_product_children).

The query is, in english: for direct children of an ID, return the ones 
for which isrel is true.

I have only a tiny demo table set for which there is only one record 
matched by the queries below, it's hard to guess at how deep or branchy 
a production table might be, so I'm trying to develop a general query 
strategy and learn a thing or two about pgsql.


Here's the join:

# explain select child_pid from ssv_product_children, nv_products where 
nv_products.id = ssv_product_children.child_pid and 
ssv_product_children.pid = 1 and nv_products.isrel = 't';
                                 QUERY PLAN
--------------------------------------------------------------------------
  Hash Join  (cost=1.22..2.47 rows=2 width=8)
    Hash Cond: ("outer".child_pid = "inner".id)
    ->  Seq Scan on ssv_product_children  (cost=0.00..1.18 rows=9 width=4)
          Filter: (pid = 1)
    ->  Hash  (cost=1.21..1.21 rows=4 width=4)
          ->  Seq Scan on nv_products  (cost=0.00..1.21 rows=4 width=4)
                Filter: (isrel = true)
(7 rows)


Here's the correlated subquery:


# explain select child_pid from ssv_product_children where pid = 1 and 
child_pid = (select nv_products.id from nv_products where nv_products.id 
= child_pid and isrel = 't');
                              QUERY PLAN
---------------------------------------------------------------------
  Seq Scan on ssv_product_children  (cost=0.00..18.78 rows=1 width=4)
    Filter: ((pid = 1) AND (child_pid = (subplan)))
    SubPlan
      ->  Seq Scan on nv_products  (cost=0.00..1.26 rows=1 width=4)
            Filter: ((id = $0) AND (isrel = true))
(5 rows)


Thanks for any advice.

Responses

pgsql-performance by date

Next:From: Joshua D. DrakeDate: 2005-05-19 21:53:52
Subject: Re: Which is better, correlated subqueries or joins?
Previous:From: Anjan DaveDate: 2005-05-19 20:19:58
Subject: Re: PostgreSQL strugling during high load

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group