Re: subquery returning array

From: Richard Huxton <dev(at)archonet(dot)com>
To: Zeljko Vrba <zvrba(at)ifi(dot)uio(dot)no>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: subquery returning array
Date: 2005-05-23 13:40:55
Message-ID: 4291DD67.8030401@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Zeljko Vrba wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi! I have posted this as a user comment in section 9.17:
>
> - ----
> = ANY(array expression) doesn't work in the obvious way when the array
> expression is a subselect. For example:
>
> select * from stat3 where stat3.id = any ('{4,5,6,7}');
>
> works (and returns the expected tuples). However,
>
> select * from stat3 where stat3.id = any (select stat3 from
> helix_request where id=11);
>
> DOESN'T work and complains with an error: operator does not exist:
> integer = integer[]. The inner select returns EXACTLY ONE value, namely
> the same array as the literal in the first example.

No, it doesn't - hence the error.

What it returns is a SET containing ONE TUPLE which contains ONE ARRAY.
Here the "any" ranges over the SET not the array, so it tries to compare
the outer integer with the inner array of integers.

> The solution is:
>
> select * from stat3 where (select stat3.id = any (stat3) from
> helix_request where id=11);
>
> I thank to Chris Kings-Lynne ("KL") for helping me out with this over IRC.
> - ----
>
> KL suggested to mail this question to the hackers list. The problem with
> this solution is that postgresql uses sequential scan for the proposed
> solution:
[snip]
> I'm going to have MANY queries of this kind and having sequential scan
> for each such query will lead to quadratic performance - unacceptable
> for the amount of data I will have (millions of rows).
>
> Is there yet another way of making WHERE field = ANY (subselect
> returning an array) work? Or make postgres to use index?
>
> OK, I know the suggestion from the manual: usually it's bad database
> design searching through arrays and a separate table is better. however,
> to convert {4,5,6,7} into a table would require a table with two
> columns, e.g. (1,4), (1,5), (1,6), (1,7) where the first column would be
> used to identify elements in the single array). this additional column
> is a waste of space.

You're trading space for time, and deliberately spoiling your design to
do so. If you're going to want to access each value separately, design
your system to reflect that fact.

Have you actually tested your example with a proper two-column table? It
might be that weaknesses in PostgreSQL force you to compromise your
design, but I wouldn't start from that point. Try a proper relational
design, with a bit of tuning and see if that will work for you.

If you really want to use arrays, perhaps see if the contrib/intarray
module can help you.

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2005-05-23 13:58:58 Re: PATCH to allow concurrent VACUUMs to not lock each
Previous Message --= Tono =-- 2005-05-23 13:15:18 INSTEAD OF trigger on VIEWs