Re: Q: Table scans on set difference

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alban Hertroys <alban(at)magproductions(dot)nl>
Cc: "G(dot) Ralph Kuntz, MD" <grk(at)usa(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Q: Table scans on set difference
Date: 2006-07-14 15:36:23
Message-ID: 20529.1152891383@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Alban Hertroys <alban(at)magproductions(dot)nl> writes:
> G. Ralph Kuntz, MD wrote:
>> explain select file_name from encounter_properties_table where file_name not
>> in (select filename from xfiles);

> What about:
> explain select file_name from encounter_properties_table
> where not exists (
> select file_name from xfiles where filename = file_name);

If you only need the file name, an EXCEPT would probably work much
better:

select file_name from encounter_properties_table
except
select filename from xfiles;

Another possibility is to abuse the outer join machinery:

select file_name, ... from
encounter_properties_table l left join xfiles r
on l.file_name = r.filename
where r.filename is null;

Generally speaking, NOT IN performance is going to suck unless the
sub-select is small enough to fit in a hashtable. You could consider
increasing work_mem enough that it would fit, but with 500K filenames
needed, that's probably not going to win.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2006-07-14 15:48:25 Re: apparent wraparound
Previous Message Florian Weimer 2006-07-14 15:31:38 Re: Timestamp vs timestamptz