Arrays and index scan

From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: PostgreSQL Performance List <pgsql-performance(at)postgresql(dot)org>
Subject: Arrays and index scan
Date: 2006-04-28 14:46:27
Message-ID: 44522AC3.6030707@logix-tt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

I'm searching for a comfortable way to get a variable-size bunch of user
specified Objects via a single prepared statement, so I wanted to submit
an ARRAY.

However, the query planner seems to refuse to make index scans even with
8.1:

testdb=# EXPLAIN SELECT * from streets WHERE link_id = ANY(ARRAY[1,2,3]);
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on streets (cost=0.00..288681.74 rows=1713754 width=393)
Filter: (link_id = ANY ('{1,2,3}'::integer[]))
(2 rows)

Via IN, it works fine, but hast the disadvantage that we cannot use
prepared statements effectively:

testdb=# explain select * from streets where link_id in (1,2,3);
QUERY PLAN

-----------------------------------------------------------------------------------------------
Bitmap Heap Scan on streets (cost=6.02..16.08 rows=5 width=393)
Recheck Cond: ((link_id = 1) OR (link_id = 2) OR (link_id = 3))
-> BitmapOr (cost=6.02..6.02 rows=5 width=0)
-> Bitmap Index Scan on streets_link_id_idx (cost=0.00..2.01
rows=2 width=0)
Index Cond: (link_id = 1)
-> Bitmap Index Scan on streets_link_id_idx (cost=0.00..2.01
rows=2 width=0)
Index Cond: (link_id = 2)
-> Bitmap Index Scan on streets_link_id_idx (cost=0.00..2.01
rows=2 width=0)
Index Cond: (link_id = 3)
(9 rows)

And on the net, I found a nice trick via an "array flattening" function,
which at least uses a nested loop of index scans instead of an index
bitmap scan:

testdb=# CREATE FUNCTION flatten_array(anyarray) RETURNS SETOF
anyelement AS
testdb-# 'SELECT ($1)[i] FROM (SELECT
generate_series(array_lower($1,1),array_upper($1,1)) as i) as foo;'
testdb-# language SQL STRICT IMMUTABLE;

testdb=# EXPLAIN SELECT * from streets JOIN flatten_array(ARRAY[1,2,3])
on flatten_array=link_id;
QUERY PLAN

--------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..5882.15 rows=1566 width=397)
-> Function Scan on flatten_array (cost=0.00..12.50 rows=1000 width=4)
-> Index Scan using treets_link_id_idx on streets (cost=0.00..5.84
rows=2 width=393)
Index Cond: ("outer".flatten_array = streets.link_id)
(4 rows)

Currently, we're planning to use the array flattening approach, but are
there any plans to enhance the query planner for the direct ARRAY approach?

Thanks,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-04-28 14:57:00 Re: CPU usage goes to 100%, query seems to ran forever
Previous Message Bealach-na Bo 2006-04-28 11:41:06 Re: Why so slow?