Re: Query Problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: josh(at)agliodbs(dot)com
Cc: sqllist <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Query Problem
Date: 2000-10-26 15:16:39
Message-ID: 5722.972573399@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Thus, I need to select:

> SELECT Data FROM Table A
> WHERE CaseID NOT IN (
> SELECT CaseID FROM Table_B, Table_C
> WHERE Table_B.GroupID = TableC.GroupID
> AND TableC.AccountID = 11)

> The problem is, since Table_B and Table_C are large (10,000 records +)
> this exclusion query takes several *minutes* to run.

I don't think there is any good way to make this fast in current
sources. A partial workaround is to use a temp table:

SELECT CaseID INTO TEMP TABLE mycaseids FROM Table_B, Table_C
WHERE Table_B.GroupID = TableC.GroupID
AND TableC.AccountID = 11;

CREATE INDEX mycaseids_idx ON mycaseids(caseid); -- critical!

SELECT Data FROM TableA upper
WHERE NOT EXISTS (select 1 from mycaseids where caseid = upper.caseid);

You'd need to check with EXPLAIN, but the EXISTS subplan should make
use of the index to probe the temp table, so you get one index lookup
per outer tuple. Better than a complete scan of the subselect outputs,
which is what you'll get with the NOT IN style.

In 7.1 it'll be possible to do this with an outer join, which should
be a lot quicker:

SELECT Data FROM TableA LEFT JOIN
(SELECT CaseID FROM Table_B, Table_C
WHERE Table_B.GroupID = TableC.GroupID
AND TableC.AccountID = 11) subselect
ON (tablea.caseid = subselect.caseid)
WHERE subselect.caseid IS NULL;

ie, do the outer join and then discard the successfully-matched rows.

Further down the pike, we have plans to make the system smart enough to
transform IN and NOT IN constructs into join-like queries automatically.
Right now, though, they're best rewritten into something else when
performance is important.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2000-10-26 15:21:44 Re: Alternate Database Locations
Previous Message Tom Lane 2000-10-26 15:00:24 Re: [SQL] pg_atoi: error in "template1": can't parse "template1"