Skip site navigation (1) Skip section navigation (2)

Re: Reason of Slowness of query

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: Adarsh Sharma <adarsh(dot)sharma(at)orkash(dot)com>
Cc: Chetan Suttraway <chetan(dot)suttraway(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Reason of Slowness of query
Date: 2011-03-23 08:51:25
Message-ID: 4D89B48D.80800@gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
23.03.11 09:30, Adarsh Sharma ???????(??):
> Thanks Chetan, here is the output of your updated query :
>
>
> *explain  select distinct(p.crawled_page_id) from page_content p where 
> NOT EXISTS (select 1 from  clause2 c where c.source_id = 
> p.crawled_page_id);
>
> *
>                                       QUERY PLAN
> ---------------------------------------------------------------------------------------
>  HashAggregate  (cost=1516749.47..1520576.06 rows=382659 width=8)
>    ->  Hash Anti Join  (cost=1294152.41..1515791.80 rows=383071 width=8)
>          Hash Cond: (p.crawled_page_id = c.source_id)
>          ->  Seq Scan on page_content p  (cost=0.00..87132.17 
> rows=428817 width=8)
>          ->  Hash  (cost=771182.96..771182.96 rows=31876196 width=4)
>                ->  Seq Scan on clause2 c  (cost=0.00..771182.96 
> rows=31876196 width=4)
> (6 rows)
>
> And my explain analyze output is :
>
>                                                       QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=1516749.47..1520576.06 rows=382659 width=8) 
> (actual time=56666.181..56669.270 rows=72 loops=1)
>    ->  Hash Anti Join  (cost=1294152.41..1515791.80 rows=383071 
> width=8) (actual time=45740.789..56665.816 rows=74 loops=1)
>          Hash Cond: (p.crawled_page_id = c.source_id)
>          ->  Seq Scan on page_content p  (cost=0.00..87132.17 
> rows=428817 width=8) (actual time=0.012..715.915 rows=428467 loops=1)
>          ->  Hash  (cost=771182.96..771182.96 rows=31876196 width=4) 
> (actual time=45310.524..45310.524 rows=31853083 loops=1)
>                ->  Seq Scan on clause2 c  (cost=0.00..771182.96 
> rows=31876196 width=4) (actual time=0.055..23408.884 rows=31853083 
> loops=1)
>  Total runtime: 56687.660 ms
> (7 rows)
>
> But Is there is any option to tune it further and one more thing 
> output rows varies from 6 to 7.
You need an index on source_id to prevent seq scan, like the next:
CREATE INDEX idx_clause2_source_id
   ON clause2
   (source_id);*

*Best regards, Vitalii Tymchyshyn

In response to

pgsql-performance by date

Next:From: Adarsh SharmaDate: 2011-03-23 09:17:51
Subject: Re-Reason of Slowness of Query
Previous:From: Adarsh SharmaDate: 2011-03-23 07:30:26
Subject: Re: Reason of Slowness of query

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group