Re: BUG #7556: "select not in sub query" plan very poor vs "not exists"

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-bugs(at)postgresql(dot)org>,<l1t(at)tom(dot)com>
Subject: Re: BUG #7556: "select not in sub query" plan very poor vs "not exists"
Date: 2012-09-19 18:06:51
Message-ID: 5059C36B020000250004A505@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

<l1t(at)tom(dot)com> wrote:

> test=# explain select max(a.info)from sli_test a where a.id not
> in(select b.id from sli_test2 b where b.id<50000);

> Aggregate (cost=9241443774.00..9241443774.01 rows=1 width=12)

A slower plan for NOT IN than NOT EXISTS is a fact, but definitely
not a bug. According to the standard, the semantics of NOT IN are
different from NOT EXISTS when there is a possibility of NULLs on
either side. Because of those different semantics, NOT IN cannot
use certain optimizations which are available for NOT EXISTS.

Technically, if a NOT IN case could be analyzed to the point where
it is clear that both sides of the predicate are definitely free of
NULLs, it could be run the same as NOT EXISTS, but that would add
complexity and run-time expense to the optimizer. Some cases are
simple, but some are not at all practical. We have chosen instead
to recommend that people use NOT EXISTS unless they really want the
rather astonishing behavior of NOT IN.

-Kevin

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Reinhard Max 2012-09-19 21:05:09 Re: BUG #7559: syslogger doesn't close stdout and stderr
Previous Message Tom Lane 2012-09-19 16:39:12 Re: BUG #7559: syslogger doesn't close stdout and stderr