| From: | "Jackson, DeJuan" <djackson(at)cpsgroup(dot)com> | 
|---|---|
| To: | Joerg Schultz <Joerg(dot)Schultz(at)EMBL-Heidelberg(dot)de>, pgsql-general(at)postgreSQL(dot)org | 
| Subject: | RE: [GENERAL] cost difference for 2 similar queries | 
| Date: | 1998-06-15 21:53:57 | 
| Message-ID: | F10BB1FAF801D111829B0060971D839F2DCCDD@cpsmail | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
> Hi,
> 
> I have 2 queries, which are quite similar:
> 
> a) select distinct p1.domain from dom_prot p1, identical p2,
> classification p3
>    where p2.reference = p1.protein
>    and p2.protein ~ 'EPA3_HUMAN'
>    and p3.domain = p1.domain;
> 
> b) select distinct p1.domain from dom_prot p1, identical p2,
> classification p3
>    where p2.reference = p1.protein
>    and p2.protein ~ 'EPA3_HUMAN'
>    and p3.domain = p1.domain
>    and p3.class = 'extra';
Not sure (this may take longer):
SELECT DISTINCT p1.domain
  FROM dom_prot AS p1 
 WHERE EXISTS (SELECT p2.reference
                 FROM identical AS p2
                WHERE p2.reference = p1.protein AND
                      p2.protein ~ 'EPA3_HUMAN') AND
       EXISTS (SELECT p3.domain
                 FROM classification AS p3
                WHERE p3.domain = p1.domain AND
                      p3.class = 'extra');
I'd select bak the first field in thew subselects from p2 and p3
respectfully.
> As you see, they differ only in the last line. The problem is, that
> the time
> they need differes tremendously. Query a) takes about 5 sec, query b)
> 2 min !!
> I don't understand why this last constraint over an indexed field
> leads to such
> a difference.
> 
> Here is what explain says:
> 
> a)
> NOTICE:  QUERY PLAN:
> 
> Unique  (cost=5193.21 size=0 width=0)
>   ->  Sort  (cost=5193.21 size=0 width=0)
>         ->  Nested Loop  (cost=5193.21 size=1 width=48)
>               ->  Nested Loop  (cost=5191.16 size=1 width=36)
>                     ->  Seq Scan on p2  (cost=5189.11 size=1 width=12)
>                     ->  Index Scan on p1  (cost=2.05 size=62250
> width=24)
>               ->  Index Scan on p3  (cost=2.05 size=189 width=12)
> EXPLAIN
> 
> b)
> NOTICE:  QUERY PLAN:
> 
> Unique  (cost=680.60 size=0 width=0)
>   ->  Sort  (cost=680.60 size=0 width=0)
>         ->  Nested Loop  (cost=680.60 size=1 width=48)
>               ->  Nested Loop  (cost=4.10 size=330 width=36)
>                     ->  Index Scan on p3  (cost=2.05 size=1 width=12)
>                     ->  Index Scan on p1  (cost=2.05 size=62250
> width=24)
>               ->  Index Scan on p2  (cost=2.05 size=1 width=12)
> 
> EXPLAIN
> 
> 
> Dou you have any idea, how to improve query b) ??
> 
> Thanx,
> Joerg
> 
Let me know how it performs (or if I made any typoes),
		-DEJ
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Byron Nikolaidis | 1998-06-15 22:24:42 | NEW ODBC DRIVER v.0246 | 
| Previous Message | Justin Hickey | 1998-06-15 16:26:06 | Y2K compliant |