Skip site navigation (1) Skip section navigation (2)

Re: Re-Reason of Slowness of Query

From: Shaun Thomas <sthomas(at)peak6(dot)com>
To: Adarsh Sharma <adarsh(dot)sharma(at)orkash(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Re-Reason of Slowness of Query
Date: 2011-03-23 13:34:29
Message-ID: 4D89F6E5.7010803@peak6.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 03/23/2011 04:17 AM, Adarsh Sharma wrote:

> explain analyze select distinct(p.crawled_page_id) from page_content
> p where NOT EXISTS (select 1 from clause2 c where c.source_id =
> p.crawled_page_id);

You know... I'm surprised nobody has mentioned this, but DISTINCT is 
very slow unless you have a fairly recent version of Postgres that 
replaces it with something faster. Try this:

EXPLAIN ANALYZE
SELECT p.crawled_page_id
   FROM page_content p
  WHERE NOT EXISTS (
          SELECT 1
            FROM clause2 c
           WHERE c.source_id = p.crawled_page_id
        )
  GROUP BY p.crawled_page_id;

Or if you like the cleaner query without a sub-select:

EXPLAIN ANALYZE
SELECT p.crawled_page_id
   FROM page_content p
   LEFT JOIN clause2 c ON (c.source_id = p.crawled_page_id)
  WHERE c.source_id IS NULL
  GROUP BY p.crawled_page_id;

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas(at)peak6(dot)com

______________________________________________

See  http://www.peak6.com/email_disclaimer.php
for terms and conditions related to this email

In response to

Responses

pgsql-performance by date

Next:From: Uwe BartelsDate: 2011-03-23 13:39:55
Subject: Re: buffercache/bgwriter
Previous:From: Jochen ErwiedDate: 2011-03-23 13:19:59
Subject: Re: buffercache/bgwriter

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group