Re: Re-Reason of Slowness of Query

From: Adarsh Sharma <adarsh(dot)sharma(at)orkash(dot)com>
To: Chetan Suttraway <chetan(dot)suttraway(at)enterprisedb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, tv(at)fuzzy(dot)cz
Subject: Re: Re-Reason of Slowness of Query
Date: 2011-03-23 11:21:17
Message-ID: 4D89D7AD.4050503@orkash.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thank U all, for U'r Nice Support.

Let me Conclude the results, below results are obtained after finding
the needed queries :

*First Option :

*pdc_uima=# explain analyze select distinct(p.crawled_page_id)
pdc_uima-# from page_content p left join clause2 c on (p.crawled_page_id =
pdc_uima(# c.source_id) where (c.source_id is null);

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=100278.16..104104.75 rows=382659 width=8) (actual
time=87927.000..87930.084 rows=72 loops=1)
-> Nested Loop Anti Join (cost=0.00..99320.46 rows=383079 width=8)
(actual time=0.191..87926.546 rows=74 loops=1)
-> Seq Scan on page_content p (cost=0.00..87132.17
rows=428817 width=8) (actual time=0.027..528.978 rows=428467 loops=1)
-> Index Scan using idx_clause2_source_id on clause2 c
(cost=0.00..18.18 rows=781 width=4) (actual time=0.202..0.202 rows=1
loops=428467)
Index Cond: (p.crawled_page_id = c.source_id)
Total runtime: 87933.882 ms :-(
(6 rows)

*Second Option :

*pdc_uima=# explain analyze select distinct(p.crawled_page_id) from
page_content p
pdc_uima-# where NOT EXISTS (select 1 from clause2 c where c.source_id
= p.crawled_page_id);

QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=100278.16..104104.75 rows=382659 width=8) (actual
time=7047.259..7050.261 rows=72 loops=1)
-> Nested Loop Anti Join (cost=0.00..99320.46 rows=383079 width=8)
(actual time=0.039..7046.826 rows=74 loops=1)
-> Seq Scan on page_content p (cost=0.00..87132.17
rows=428817 width=8) (actual time=0.008..388.976 rows=428467 loops=1)
-> Index Scan using idx_clause2_source_id on clause2 c
(cost=0.00..18.18 rows=781 width=4) (actual time=0.013..0.013 rows=1
loops=428467)
Index Cond: (c.source_id = p.crawled_page_id)
Total runtime: 7054.074 ms :-)
(6 rows)

Thanks & best Regards,
Adarsh Sharma

Chetan Suttraway wrote:
>
>
> On Wed, Mar 23, 2011 at 4:08 PM, <tv(at)fuzzy(dot)cz <mailto:tv(at)fuzzy(dot)cz>> wrote:
>
> > I just want to retrieve that id 's from page_content which do
> not have
> > any entry in clause2 table.
>
> In that case the query probably does not work (at least the query
> you've
> sent in the first post) as it will return even those IDs that have at
> least one other row in 'clause2' (not matching the != condition).
> At least
> that's how I understand it.
>
> true.
>
>
> So instead of this
>
> select distinct(p.crawled_page_id)
> from page_content p, clause2 c where p.crawled_page_id !=
> c.source_id ;
>
> you should probably do this
>
> select distinct(p.crawled_page_id)
> from page_content p left join clause2 c on (p.crawled_page_id =
> c.source_id) where (c.source_id is null);
>
> I guess this will be much more efficient too.
>
>
> This looks like to give expected results. Also note that the where
> clause "is null" is really required and is not an
> optional predicate.
>
>
>
> regards
> Tomas
>
>
> --
> Sent via pgsql-performance mailing list
> (pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>
>
>
> --
> Regards,
> Chetan Suttraway
> EnterpriseDB <http://www.enterprisedb.com/>, The Enterprise PostgreSQL
> <http://www.enterprisedb.com/> company.
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vitalii Tymchyshyn 2011-03-23 11:21:38 Re: Re-Reason of Slowness of Query
Previous Message Chetan Suttraway 2011-03-23 10:44:22 Re: Re-Reason of Slowness of Query