Re: NOT IN Optmization

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>, David Wheeler <david(at)wheeler(dot)net>
Cc: <sfpug(at)postgresql(dot)org>
Subject: Re: NOT IN Optmization
Date: 2003-08-09 18:20:06
Message-ID: 200308091120.06564.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: sfpug

David, Stephan,

> Maybe something of the form
> AND NOT EXISTS (SELECT 1 FROM job__resource
> WHERE job__id = 1027 and r.id = resource__id)
> will work better for the older systems (it sometimes does, it sometimes
> doesn't)

IME, NOT EXISTS works better than NOT IN on all 7.1 --> 7.3 for any subquery
result which is consistently 20 values or more.

HOWEVER, the subquery estimate for EXISTS clauses up to 7.4 is a hack ...
pretty much the planner assumes that there will be 50% correlation, which can
often result in an unnecessary seq scan (mind you, NOT IN is no better for
large lists). So it pays to "overdetermine" your criteria in the main query;
that is, add WHERE clauses to the main query even where they are reduntant
with the subquery to give the planner the right idea.

--
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse sfpug by date

  From Date Subject
Next Message David Wheeler 2003-08-09 20:05:08 Re: NOT IN Optmization
Previous Message Stephan Szabo 2003-08-09 05:44:25 Re: NOT IN Optmization