Re: Subqueries and the optimizer

From: "Dean Gibson (DB Administrator)" <dba-sql(at)ultimeth(dot)net>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Subqueries and the optimizer
Date: 2003-05-20 21:11:03
Message-ID: 5.1.0.14.2.20030520140632.02d0a530@imaps.ultimeth.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I had the same problem (VERY SLOW when using a WHERE xxx IN (SELECT ...)), but fortunately, a join solved my problem. However, the WHERE xxx IN (SELECT ...) syntax is often more natural. I'm also waiting for 7.4.

-- Dean

Dmitri Bichko wrote on 2003-05-20 13:45:
>I wish it were as easy as a join - the query is much simplified for the purpose of the example, in reality the subselect is more complicated and includes a GROUP BY (which, at least as far as I know, makes subqueries the only way of doing this).
>
>Thanks anway, guess I'll wait for 7.4 with this (and just split them up into two queries for the time being),
>Dmitri
>
>-----Original Message-----
>From: Dean Gibson (DB Administrator) [mailto:dba-sql(at)ultimeth(dot)net]
>Sent: Tuesday, May 20, 2003 2:28 PM
>To: pgsql-general(at)postgresql(dot)org
>Subject: Re: [GENERAL] Subqueries and the optimizer
>
>
>Try:
>
>SELECT blast_id FROM genes, ll_out_mm WHERE blast_batch_id = 2 AND
>genes.locus_id = ll_out_mm.locus_id;
>
>Using more recent versions of PostgreSQL, you can also write:
>
>SELECT blast_id FROM genes JOIN ll_out_mm USING ( locus_id) WHERE
>blast_batch_id = 2;
>
>-- Dean
>
>Dmitri Bichko wrote on 2003-05-20 10:50:
>>So, I have a table with an index:
>>
>>dev=# explain SELECT blast_id FROM genes WHERE blast_batch_id = 2 AND
>>locus_id IN (1,2);
>> QUERY PLAN
>>-----------------------------------------------------------------------
>-
>>------------------------------
>> Index Scan using idx_genes_locus_id, idx_genes_locus_id on genes
>>(cost=0.00..88.21 rows=14 width=4)
>> Index Cond: ((locus_id = 1) OR (locus_id = 2))
>> Filter: (blast_batch_id = 2)
>>
>>So far so good, but when I try it with a subquery:
>>
>>dev=# explain SELECT blast_id FROM genes WHERE blast_batch_id = 2 AND
>>locus_id IN (SELECT locus_id FROM ll_out_mm);
>>QUERY PLAN
>>-----------------------------------------------------------------------
>-
>> Seq Scan on genes (cost=0.00..21414353.48 rows=11003 width=4)
>> Filter: ((blast_batch_id = 2) AND (subplan))
>> SubPlan
>> -> Seq Scan on ll_out_mm (cost=0.00..1267.64 rows=59264 width=4)
>>
>>
>>How can I nudge the optimizer in the direction of using the index in
>the
>>second case as well? Or is it supposed to be doing this in this case.
>>
>>Thanks,
>>Dmitri
>>
>>---------------------------(end of
>broadcast)---------------------------
>>TIP 3: if posting/reading through Usenet, please send an appropriate
>>subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>>message can get through to the mailing list cleanly
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: Have you searched our list archives?
>
>http://archives.postgresql.org
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

Browse pgsql-general by date

  From Date Subject
Next Message Rod Taylor 2003-05-20 21:14:31 Re: sequence caches
Previous Message Dave Krieger 2003-05-20 21:05:07 Building 7.1.3 on Solaris 2.6