Select in subselect vs select = any array

From: Adam Tistler <atistler(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Select in subselect vs select = any array
Date: 2011-03-20 06:47:15
Message-ID: 75F42287-42D3-4FD5-AFFA-64B1CE3C0195@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have noticed that SELECT ... = ANY(ARRAY(...)) is about twice as fast as SELECT IN ( ... ).
Can anyone explain a reason for this? Results are the bottom and are reproducible. I can test with other versions if that is necessary.

./configure --prefix=/usr/local/pgsql84 --with-openssl --with-perl
CentOS release 5.4 (Final)
psql (PostgreSQL) 8.4.1

prompt2=# select count(*) from nodes;
count
--------
754734
(1 row)

prompt2=# \d nodes
Table "public.nodes"
Column | Type | Modifiers
--------------+--------------------------+-----------------------------------------------------------
node_id | integer | not null default nextval(('node_id_seq'::text)::regclass)
node_type_id | integer | not null
template_id | integer | not null
timestamp | timestamp with time zone | default ('now'::text)::timestamp(6) with time zone
Indexes:
"nodes_pkey" PRIMARY KEY, btree (node_id)
"n_node_id_index" btree (node_id)
"n_node_type_id_index" btree (node_type_id)
"n_template_id_index" btree (template_id)

prompt2=# select count(*) from nodes where node_id = any( Array(select node_id from nodes limit 100000) );
count
--------
100000
(1 row)

Time: 404.530 ms
prompt2=# select count(*) from nodes where node_id = any( Array(select node_id from nodes limit 100000) );
count
--------
100000
(1 row)

Time: 407.316 ms
prompt2=# select count(*) from nodes where node_id = any( Array(select node_id from nodes limit 100000) );
count
--------
100000
(1 row)

Time: 408.728 ms
prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
count
--------
100000
(1 row)

Time: 793.840 ms
prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
count
--------
100000
(1 row)

Time: 779.137 ms
prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
count
--------
100000
(1 row)

Time: 781.820 ms

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2011-03-20 06:51:20 Re: Select in subselect vs select = any array
Previous Message bricklen 2011-03-19 16:58:32 Re: Fastest pq_restore?