From: | "Llew Sion Goodstadt" <leo(dot)goodstadt(at)human-anatomy(dot)oxford(dot)ac(dot)uk> |
---|---|
To: | <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: where not exists |
Date: | 2002-03-15 12:46:17 |
Message-ID: | 004801c1cc1f$663c39c0$1c1d01a3@FGU028 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I ended up by using an external programme.
NOT EXISTS is just a set difference.
Doing set compares is really quick if both sets are sorted.
I use CRC64s for the data and just compare the resulting sorted sets of
(large CRC 64-bit) numbers.
Because everything hashes to a number, the memory requirements are not
that bad either (8 bytes per item ~256000 tuples per Mb).
The programme is in C++ but is as fast in something like Perl.
I.e. comparing millions of rows of data takes 10s of seconds rather than
10s of minutes.
Leo
>
> 1) get items which are orphaned in a.
> CREATE TEMP TABLE orphans as
> SELECT a.join1, a.join2
> FROM a LEFT OUTER JOIN b ON a.join1 = b.join1 AND
> a.join2 = b.join2
> WHERE b.join1 IS NULL AND b.join2 IS NULL
>
> D A
>
>
> "Llew" <leo(dot)goodstadt(at)anat(dot)ox(dot)ac(dot)uk> wrote in message
> news:a65qm1$2k6g$1(at)jupiter(dot)hub(dot)org(dot)(dot)(dot)
> > Dear everyone,
> > What is the best way of removing rows which are not in
> another table?
From | Date | Subject | |
---|---|---|---|
Next Message | Enrico Mangano | 2002-03-15 14:28:02 | R: [SQL] UPDATE and SELECT result difference |
Previous Message | Peter Eisentraut | 2002-03-15 05:10:05 | Re: Boolean test |