Skip site navigation (1) Skip section navigation (2)

Re: CLUSTER and a problem

From: Andrzej Zawadzki <zawadaa(at)wp(dot)pl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: CLUSTER and a problem
Date: 2009-09-15 11:13:43
Message-ID: 4AAF76E7.9050608@wp.pl (view raw or flat)
Thread:
Lists: pgsql-performance
Andrzej Zawadzki wrote:
> Tom Lane wrote:
>   
>> Andrzej Zawadzki <zawadaa(at)wp(dot)pl> writes:
>>   
>>     
>>> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
>>> WHERE TRUE
>>> AND kredytyag.id = 3064776
>>> AND NOT EXISTS
>>> (SELECT 1 FROM
>>> ( SELECT * FROM kredyty kr
>>> where telekredytid = 328650
>>> ORDER BY kr.datazaw DESC LIMIT 1 )
>>> kred where kred.bank = 2);
>>>     
>>>       
>> So this is the slow bit:
>>
>>   
>>     
>>>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
>>> time=2045556.496..2045556.496 rows=0 loops=1)
>>>            Filter: (kred.bank = 2)
>>>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
>>> time=2045556.492..2045556.492 rows=0 loops=1)
>>>                  ->  Index Scan Backward using kredyty_datazaw on
>>> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
>>> time=2045556.487..2045556.487 rows=0 loops=1)
>>>                        Filter: (telekredytid = 328650)
>>>     
>>>       
>> It's doing a scan in descending datazaw order and hoping to find a row
>> that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
>> one, so the indexscan runs clear to the end before it can report that the
>> NOT EXISTS is true.  Unfortunately, you've more or less forced this
>> inefficient query plan by wrapping some of the search conditions inside a
>> LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
>> Or, if you do this type of query a lot, a special-purpose index might be
>> worthwhile.  It would probably be fast as-is if you had an index on
>> (telekredytid, datazaw) (in that order).
>>   
>>     
> That's no problem - we already has changed this query:
> SELECT * FROM kredyty kr
>             where kr.telekredytid = 328652
>             and kr.bank = 2
>             AND NOT EXISTS (SELECT * from kredyty k2 WHERE k2.bank<>2
> and k2.creationdate > kr.creationdate)
> Works good.
>
> But in fact this wasn't my point.
> My point was: why operation CLUSTER has such a big and bad impact on
> planer for this query?
> Like I sad: before CLUSTER query was run in xx milliseconds :-)
>
>   
Before CLUSTER was:

# EXPLAIN ANALYZE SELECT telekredytid FROM kredytyag
WHERE TRUE
AND kredytyag.id = 3064776
AND NOT EXISTS
    (
    SELECT 1 FROM
        (
            SELECT * FROM kredyty kr
            where telekredytid = 328652
            ORDER BY kr.datazaw DESC LIMIT 1
    )
    kred where kred.bank = 2)
;
                                                                              
QUERY
PLAN                                                                            

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=1317.25..1325.55 rows=1 width=4) (actual
time=0.235..0.235 rows=0 loops=1)
   One-Time Filter: (NOT $0)
   InitPlan
     ->  Subquery Scan kred  (cost=1317.24..1317.25 rows=1 width=0)
(actual time=0.188..0.188 rows=0 loops=1)
           Filter: (kred.bank = 2)
           ->  Limit  (cost=1317.24..1317.24 rows=1 width=4006) (actual
time=0.172..0.172 rows=0 loops=1)
                 ->  Sort  (cost=1317.24..1320.27 rows=1212 width=4006)
(actual time=0.069..0.069 rows=0 loops=1)
                       Sort Key: kr.datazaw
                       Sort Method:  quicksort  Memory: 25kB
                       ->  Index Scan using kredyty_telekredytid_idx on
kredyty kr  (cost=0.00..1311.18 rows=1212 width=4006) (actual
time=0.029..0.029 rows=0 loops=1)
                             Index Cond: (telekredytid = 328652)
   ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..8.29
rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
         Index Cond: (id = 3064776)
 Total runtime: 1.026 ms
(14 rows)

and that's clear for me.
Probably bad index for CLUSTER - Investigating ;-)

-- 
Andrzej Zawadzki

In response to

pgsql-performance by date

Next:From: Gaël Le MignotDate: 2009-09-15 12:58:15
Subject: Problem with partitionning and orderby query plans
Previous:From: zz_11Date: 2009-09-15 10:09:54
Subject: Re: possible wrong query plan on pg 8.3.5,

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group