Re: Re-Reason of Slowness of Query

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Re-Reason of Slowness of Query
Date: 2011-03-23 11:21:38
Message-ID: 4D89D7C2.3040307@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

23.03.11 13:21, Adarsh Sharma ???????(??):
> Thank U all, for U'r Nice Support.
>
> Let me Conclude the results, below results are obtained after finding
> the needed queries :
>
> *First Option :
>
> *pdc_uima=# explain analyze select distinct(p.crawled_page_id)
> pdc_uima-# from page_content p left join clause2 c on (p.crawled_page_id =
> pdc_uima(# c.source_id) where (c.source_id is null);
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=100278.16..104104.75 rows=382659 width=8)
> (actual time=87927.000..87930.084 rows=72 loops=1)
> -> Nested Loop Anti Join (cost=0.00..99320.46 rows=383079
> width=8) (actual time=0.191..87926.546 rows=74 loops=1)
> -> Seq Scan on page_content p (cost=0.00..87132.17
> rows=428817 width=8) (actual time=0.027..528.978 rows=428467 loops=1)
> -> Index Scan using idx_clause2_source_id on clause2 c
> (cost=0.00..18.18 rows=781 width=4) (actual time=0.202..0.202 rows=1
> loops=428467)
> Index Cond: (p.crawled_page_id = c.source_id)
> Total runtime: 87933.882 ms:-(
> (6 rows)
>
> *Second Option :
>
> *pdc_uima=# explain analyze select distinct(p.crawled_page_id) from
> page_content p
> pdc_uima-# where NOT EXISTS (select 1 from clause2 c where
> c.source_id = p.crawled_page_id);
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=100278.16..104104.75 rows=382659 width=8)
> (actual time=7047.259..7050.261 rows=72 loops=1)
> -> Nested Loop Anti Join (cost=0.00..99320.46 rows=383079
> width=8) (actual time=0.039..7046.826 rows=74 loops=1)
> -> Seq Scan on page_content p (cost=0.00..87132.17
> rows=428817 width=8) (actual time=0.008..388.976 rows=428467 loops=1)
> -> Index Scan using idx_clause2_source_id on clause2 c
> (cost=0.00..18.18 rows=781 width=4) (actual time=0.013..0.013 rows=1
> loops=428467)
> Index Cond: (c.source_id = p.crawled_page_id)
> Total runtime: 7054.074 ms :-)
> (6 rows)
>

Actually the plans are equal, so I suppose it depends on what were run
first :). Slow query operates with data mostly on disk, while fast one
with data in memory.

Best regards, Vitalii Tymchyshyn

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Chetan Suttraway 2011-03-23 11:24:10 Re: Re-Reason of Slowness of Query
Previous Message Adarsh Sharma 2011-03-23 11:21:17 Re: Re-Reason of Slowness of Query