Re: Proper Join and check

From: "Billy G(dot) Allie" <Bill(dot)Allie(at)mug(dot)org>
To: Patrick Nelson <pnelson(at)neatech(dot)com>
Cc: "PostgreSQL List (E-mail)" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Proper Join and check
Date: 2002-10-07 06:01:13
Message-ID: 200210070601.g9761DI08907@bajor.mug.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Patrick Nelson wrote:
> Billy G. Allie wrote:
> ----------------->>>>
> The query you are looking for is:
>
> select a.sym from tableone a
> where a.sym not in (select b.sym from tablemany b
> where b.sym = a.sym);
>
> This query will run MUCH faster if you create a secondary index for
> tablemany (in fact, don't even try it without the index for any significant
> number or rows):
[. . .]
> Oh yeah it helped... Thanks the query took like 4 seconds. I'm not sure I
> totally understand the b.sym = a.sym though.

With the b.sym = a.sym, the query optimizer will use an index scan of tablemany.
Without it, a sequential scan will be used (very VERY S L O W for any signifi-
cant number of rows).

Without the b.sym = a.sym, the result of the subselect will be all the rows in
tablemany which will have to be scanned to see if the test (not in) succeeds.
With the b.sym = a.sym, the result of the subselect will be empty or the value
of b.sym (as determined by an index lookup, which is why the index was needed).

I hope this clarifies things somewhat.
___________________________________________________________________________
____ | Billy G. Allie | Domain....: Bill(dot)Allie(at)mug(dot)org
| /| | 7436 Hartwell | MSN.......: B_G_Allie(at)email(dot)msn(dot)com
|-/-|----- | Dearborn, MI 48126|
|/ |LLIE | (313) 582-1540 |

Browse pgsql-general by date

  From Date Subject
Next Message Shridhar Daithankar 2002-10-07 07:15:57 Re: understanding insert slowdown
Previous Message Tom Lane 2002-10-07 03:20:33 cross-posts (was Re: Large databases, performance)