| From: | Alfred Perlstein <bright(at)wintelcom(dot)net> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-committers(at)postgresql(dot)org |
| Subject: | Re: pgsql/src/test/regress/expected (union.out) |
| Date: | 2000-10-05 21:34:44 |
| Message-ID: | 20001005143444.I27736@fw.wintelcom.net |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-committers |
* Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> [001005 14:32] wrote:
> Alfred Perlstein <bright(at)wintelcom(dot)net> writes:
> >> Reimplementation of UNION/INTERSECT/EXCEPT.
>
> > Does this mean that in the next release EXCEPT will be a lot faster?
> > Will I probably be able to drop my "NOT EXISTS" hacks that I've
> > been using?
>
> UNION/INTERSECT/EXCEPT are now all basically a sort phase and a
> unique-filter phase, with minor variations on what the unique filter
> thinks it should output. So the cost should be O((M+N) log (M+N)) for
> M+N input tuples, as opposed to O(M*N) for the old INTERSECT and
> EXCEPT code.
>
> I didn't do anything to change EXISTS ...
A while back I think you helped me, I was using EXCEPT, but the perf
was really awful:
* Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> [000510 16:22] wrote:
> Alfred Perlstein <bright(at)wintelcom(dot)net> writes:
> > =# select ref_id from ref_old except select ref_id from ref_new;
> > Takes over 10 minutes, probably closer to half an hour.
> > I've also tried using 'NOT IN ( select ref_id from ref_new )'
>
> Yup. EXCEPT is effectively translated to a NOT IN, if I recall
> correctly, and neither IN ( sub-select ) nor NOT IN ( sub-select )
> are implemented very efficiently. Basically you get O(N^2) behavior
> because the inner select is rescanned for each outer tuple.
>
> We have a TODO list item to try to be smarter about this...
>
> > Is there a way to formulate my SQL to get Postgresql to follow
> > this algorithm [ kind of like a mergejoin ]
>
> No, but you could try
>
> select ref_id from ref_old where not exists
> (select ref_id from ref_new where ref_id = ref_old.ref_id);
>
> which would at least be smart enough to consider using an index
> on ref_new(ref_id) instead of a sequential scan.
Is this what you fixed?
--
-Alfred Perlstein - [bright(at)wintelcom(dot)net|alfred(at)freebsd(dot)org]
"I have the heart of a child; I keep it in a jar on my desk."
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2000-10-05 21:42:19 | Re: pgsql/src/test/regress/expected (union.out) |
| Previous Message | Tom Lane | 2000-10-05 21:32:22 | Re: pgsql/src/test/regress/expected (union.out) |