Skip site navigation (1) Skip section navigation (2)

Arrays and index scan

From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: PostgreSQL Performance List <pgsql-performance(at)postgresql(dot)org>
Subject: Arrays and index scan
Date: 2006-04-28 14:46:27
Message-ID: 44522AC3.6030707@logix-tt.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hello,

I'm searching for a comfortable way to get a variable-size bunch of user
specified Objects via a single prepared statement, so I wanted to submit
an ARRAY.

However, the query planner seems to refuse to make index scans even with
8.1:

testdb=# EXPLAIN SELECT * from streets WHERE link_id = ANY(ARRAY[1,2,3]);
                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on streets  (cost=0.00..288681.74 rows=1713754 width=393)
   Filter: (link_id = ANY ('{1,2,3}'::integer[]))
(2 rows)



Via IN, it works fine, but hast the disadvantage that we cannot use
prepared statements effectively:

testdb=# explain select * from streets where link_id in (1,2,3);
                                          QUERY PLAN

-----------------------------------------------------------------------------------------------
 Bitmap Heap Scan on streets  (cost=6.02..16.08 rows=5 width=393)
   Recheck Cond: ((link_id = 1) OR (link_id = 2) OR (link_id = 3))
   ->  BitmapOr  (cost=6.02..6.02 rows=5 width=0)
         ->  Bitmap Index Scan on streets_link_id_idx  (cost=0.00..2.01
rows=2 width=0)
               Index Cond: (link_id = 1)
         ->  Bitmap Index Scan on streets_link_id_idx  (cost=0.00..2.01
rows=2 width=0)
               Index Cond: (link_id = 2)
         ->  Bitmap Index Scan on streets_link_id_idx  (cost=0.00..2.01
rows=2 width=0)
               Index Cond: (link_id = 3)
(9 rows)


And on the net, I found a nice trick via an "array flattening" function,
which at least uses a nested loop of index scans instead of an index
bitmap scan:

testdb=# CREATE FUNCTION flatten_array(anyarray) RETURNS SETOF
anyelement AS
testdb-# 'SELECT ($1)[i] FROM (SELECT
generate_series(array_lower($1,1),array_upper($1,1)) as i) as foo;'
testdb-# language SQL STRICT IMMUTABLE;


testdb=# EXPLAIN SELECT * from streets JOIN flatten_array(ARRAY[1,2,3])
on flatten_array=link_id;
                                            QUERY PLAN

--------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..5882.15 rows=1566 width=397)
   ->  Function Scan on flatten_array  (cost=0.00..12.50 rows=1000 width=4)
   ->  Index Scan using treets_link_id_idx on streets  (cost=0.00..5.84
rows=2 width=393)
         Index Cond: ("outer".flatten_array = streets.link_id)
(4 rows)


Currently, we're planning to use the array flattening approach, but are
there any plans to enhance the query planner for the direct ARRAY approach?

Thanks,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2006-04-28 14:57:00
Subject: Re: CPU usage goes to 100%, query seems to ran forever
Previous:From: Bealach-na BoDate: 2006-04-28 11:41:06
Subject: Re: Why so slow?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group