Re: NOT IN doesn't use index? (fwd)

From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Becky Neville <rebecca(dot)neville(at)yale(dot)edu>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: NOT IN doesn't use index? (fwd)
Date: 2003-05-03 21:16:41
Message-ID: 1051996600.1656.36.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, 2003-05-03 at 15:56, Becky Neville wrote:
> I think that list is actually (gulp) hard coded. It's not my query. I am
> trying to speed it up for someone else - to hopefully learn something in
> the process that isn't dependent on what version of postgres i'm
> running :)

An interesting test might be to see if the overhead of doing a character
based comparison (as opposed to integer based) is significant. If it
is, previous tests show it can be significant for CPU bound queries,
convert all of those codes into integers and use a lookup table table to
do the conversion.

Another interesting thought, since you have a long running query would
be to attempt an inversion. Create a temporary table with the *valid*
codes if count(valid codes) < 2 * count(invalid codes). Run the query
replacing NOT IN with a join to the temporary table. This will reduce
the number of comparisons required, as a match can move onto the next
datum, but a NOT IN must check all values. If this helps, try indexing
(and analyzing) the temporary table.

By far the fastest results can be achieved by not allowing invalid
billing codes to be inserted into the table via a constraint of somekind
(check or fkey to summary table).

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2003-05-04 02:28:52 Suggestions wanted for 7.2.4 query
Previous Message Ron Johnson 2003-05-03 20:39:34 Re: Looking for a cheap upgrade (RAID)