From: | Zeljko Vrba <zvrba(at)ifi(dot)uio(dot)no> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | subquery returning array |
Date: | 2005-05-23 06:12:35 |
Message-ID: | 42917453.1000706@ifi.uio.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi! I have posted this as a user comment in section 9.17:
- ----
= ANY(array expression) doesn't work in the obvious way when the array
expression is a subselect. For example:
select * from stat3 where stat3.id = any ('{4,5,6,7}');
works (and returns the expected tuples). However,
select * from stat3 where stat3.id = any (select stat3 from
helix_request where id=11);
DOESN'T work and complains with an error: operator does not exist:
integer = integer[]. The inner select returns EXACTLY ONE value, namely
the same array as the literal in the first example.
The solution is:
select * from stat3 where (select stat3.id = any (stat3) from
helix_request where id=11);
I thank to Chris Kings-Lynne ("KL") for helping me out with this over IRC.
- ----
KL suggested to mail this question to the hackers list. The problem with
this solution is that postgresql uses sequential scan for the proposed
solution:
Seq Scan on stat3 (cost=0.00..40018.94 rows=3321 width=32) (actual
time=0.112..75.911 rows=4 loops=1)
Filter: (subplan)
SubPlan
-> Index Scan using helix_request_pkey on helix_request
(cost=0.00..6.01 rows=1 width=32) (actual time=0.007..0.008 rows=1
loops=6756)
Index Cond: (id = 11)
Total runtime: 76.040 ms
(6 rows)
If I write select * from stat3 where id in (4,5,6,7) and index scan is used:
Index Scan using stat3_pkey, stat3_pkey, stat3_pkey, stat3_pkey on
stat3 (cost=0.00..23.94 rows=4 width=32) (actual time=0.066..0.091
rows=4 loops=1)
Index Cond: ((id = 4) OR (id = 5) OR (id = 6) OR (id = 7))
Total runtime: 0.164 ms
I'm going to have MANY queries of this kind and having sequential scan
for each such query will lead to quadratic performance - unacceptable
for the amount of data I will have (millions of rows).
Is there yet another way of making WHERE field = ANY (subselect
returning an array) work? Or make postgres to use index?
OK, I know the suggestion from the manual: usually it's bad database
design searching through arrays and a separate table is better. however,
to convert {4,5,6,7} into a table would require a table with two
columns, e.g. (1,4), (1,5), (1,6), (1,7) where the first column would be
used to identify elements in the single array). this additional column
is a waste of space.
KL has mentioned a package for converting an array into a tuple. Where
can I find it? Also, what is the limit on the number of elements in the
IN (...) condition before the database resorts to sequential scan?
PS: Please reply also via e-mail as I'm not a regular subscriber of this
list.
Thanks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iQCVAwUBQpF0U2nN2B8CwPRmAQKN8gP+JCzLiX5b48kMYmHRwTSFZWN5Jydfw0iH
MABuYj2mKCY9Dgmd4pLK8Xlxhf/tEYzd3N2lcPFYf1vIXCSpbFasRrO3hJ4WjRLr
MZ6MLXCn59Y8wtd8Iz8uug47XuYVGSreZESKA9tRfl+u8t16nPC9nEMyDNDFIRI9
ZOnBMCnQzrY=
=5qmB
-----END PGP SIGNATURE-----
From | Date | Subject | |
---|---|---|---|
Next Message | Hannu Krosing | 2005-05-23 09:49:42 | Re: PATCH to allow concurrent VACUUMs to not lock each |
Previous Message | Yoshiyuki Asaba | 2005-05-23 02:05:26 | character type value is not padded with spaces |