Performance difference between ANY and IN, also array syntax

From: Bart Grantham <bart(at)logicworks(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Performance difference between ANY and IN, also array syntax
Date: 2005-04-26 22:12:15
Message-ID: 426EBCBF.4080205@logicworks.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello, all. I work for a Manhattan ISP and have developed an internal
systems management/housekeeping app on php/postgres 7.4. I am trying to
speed up some bits with stored procedures and have had great success,
except I've now run into a bit of trouble. It comes down to this:

# SELECT * FROM connections WHERE connectee_node_id = ANY (
ARRAY[28543,28542] );
-snip-
Time: 564.899 ms

...versus...

# SELECT * FROM connections WHERE connectee_node_id IN ( 28543,28542 );
-snip-
Time: 1.410 ms

Why the difference? I tried explain:

# explain SELECT * FROM connections WHERE connectee_node_id = ANY (
ARRAY[28543,28542] );
QUERY PLAN
---------------------------------------------------------------------
Seq Scan on connections (cost=0.00..17963.44 rows=207264 width=33)
Filter: (connectee_node_id = ANY ('{28543,28542}'::integer[]))

..versus...

# explain SELECT * FROM connections WHERE connectee_node_id IN (
28543,28542 );
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Index Scan using c_connectee_node_id, c_connectee_node_id on
connections (cost=0.00..67.28 rows=72 width=33)
Index Cond: ((connectee_node_id = 28543) OR (connectee_node_id = 28542))

Why filter for one and index for the other? Is it because an array is
mutable, so it's impossible to operate on it the same way? I need to
pass in an array to my stored procedure, but having to use ANY in my
select is killing the performance. I'd like to know what I can do to
make ANY perform like IN, or alternatively, could someone fill me in on
the syntax on how to cast an INT[] to a "list of scalar expressions",
which the manual states is the right-hand side to IN.

Also, I have a few bits of confusion about array syntax that perhaps
someone could illuminate for me. Forgive me, I'm not as fluent in SQL
as other languages.

First, this doesn't work for me: RAISE NOTICE ''DEBUG: %'', _myarray[1];
It seems pretty reasonable to me, but it gives me a 'syntax error at or
near "["'.

Next, I can't seem to declare an array of a user-defined row: _innerrow
my_type%ROWTYPE[];
Is there a syntactical snag I'm tripping over? Or can you not have
arrays of other than built-in types?

I think it's super-cool that you can extract arbitrary rectangles of
data from a multi-dimentional array, but can you take a vertical slice
from an array of user-defined type by column? For example:
_mytype[1:5].some_column

And finally, how do you specifcy an entire array index when doing
columns from multi-dim arrays? Is there something like
_my_multidim_of_ints[*][4] or maybe _my_multidim_of_ints[:][4] ?

Thanks for the help, and thanks for the great database.

Bart G
Logicworks NOC

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2005-04-26 22:14:46 Re: blob storage
Previous Message Marc G. Fournier 2005-04-26 22:11:43 Re: OpenRPT