
Re: Join with an array

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Join with an array
Date: 2006-02-23 12:02:50
Lists: pgsql-hackers

Have you seen contrib/intarray?

On Thu, 23 Feb 2006, Markus Schiltknecht wrote:

> Hi,
> I'm trying to speed up a query with a lookup table. This lookup table
> gets very big and should still fit into memory. It does not change very
> often. Given these facts I decided to use an array, as follows:
> CREATE TABLE lookup_table (id INT PRIMARY KEY, items INT[] NOT NULL);
> I know this is not considered good database design, but it saves a lot
> of overhead for tuple visibility compared to a 1:1 table.
> To fetch an item via the lookup_table I tried to use the following
> query:
> SELECT, i.title FROM item i
> 	JOIN lookup_table lut ON = ANY(lut.items)
> Unfortunately that one seems to always use a sequential scan over the item
> table. As the items array in the lookup table often has only 3 - 10 entries
> (compared to about 1 million rows in the item table), this is a very
> expensive operation.
> I tried to circumvent the problem with generate_series:
> SELECT, i.title FROM generate_series(0, $MAX) s
> 	JOIN lookup_table lut ON s = ANY(lut.items)
> 	JOIN item i ON s =
> That query uses the index to look up the item, but as soon as $MAX gets
> bigger than 10000, generate_series takes too long and too many
> comparisons s = ANY(lut.items) need to be done.
> I think it would be possible to write a function generate_series(INT[])
> which returns all the elements of the array. The query would then look
> something like:
> SELECT, i.title
> 	FROM generate_series(SELECT lut.items FROM lookup_table lut WHERE
> = $LOOKUP_ID) s
> 	JOIN item i ON s =;
> Do you see any problem in implementing such a function? Does something
> similar already exist?
> Why does the first query use a seqscan instead of the index on items? Am
> I missing anything? What problems would I face if I wanted to teach the
> planner to use the index in the first query [1]?
> Regards
> Markus
> [1]: generally, in most cases like "JOIN .. ON x = ANY($ARRAY)" where
> $ARRAY is reasonably small.

Oleg Bartunov, Research Scientist, Head of AstroNet,
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su
phone: +007(495)939-16-83, +007(495)939-23-83
