Re: [SQL] Subselect performance

From:	Stuart Rison <rison(at)biochemistry(dot)ucl(dot)ac(dot)uk>
To:	Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Stuart Rison <rison(at)biochemistry(dot)ucl(dot)ac(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Lopez <ridruejo(at)atm9(dot)com(dot)dtu(dot)dk>, pgsql-sql(at)hub(dot)org
Subject:	Re: [SQL] Subselect performance
Date:	1999-09-21 15:55:23
Message-ID:	Pine.LNX.4.10.9909211645170.23146-100000@bsmlx17
Lists:	pgsql-sql

On Tue, 21 Sep 1999, Bruce Momjian wrote:
> OK, I am jumping in here, because it seems we have some strange
> behaviour.
>
> The only subselect problem I know of is that:
>
> select b from a where b in (select d from c)
>
> will execute the subquery just once, but will do a sequential scan
> of the subquery results for each row of 'a' looking for 'b' that is in
> the set of result rows.

Oh, OK, that's very possible. I was always under the impression (very
possibly misguided) that the reason it took a long time to do an "in
(select...)" was that the sub-select was actually executed for every row
in 'a' so that you ended up doing:

1x sequential scan of a
a x select on c (one select on c per row of a)

whereas if you did the sub-select independently and cut-and-pasted the
obtained set into the "in (...)" you were in point of fact just doing:

1x sequential scan of a (each row being tested against loads of OR conditions)

therefore saving the "a x select" time.
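
The two pictures above can be compared with a toy model. This is only a sketch in Python, not what the PostgreSQL executor actually does: the function names, the in-memory lists standing in for tables 'a' and 'c', and the idea of counting "executions" and "comparisons" are all assumptions made for illustration. NULL handling is ignored.

```python
def linear_scan(value, rows):
    """Scan rows front to back; return (found, comparisons made)."""
    comparisons = 0
    for d in rows:
        comparisons += 1
        if d == value:
            return True, comparisons
    return False, comparisons


def rerun_per_row(a_rows, run_subselect):
    """Assumed model 1: the sub-select is executed again for every row of a."""
    executions = 0
    comparisons = 0
    matches = []
    for b in a_rows:
        result = run_subselect()      # one execution per row of a
        executions += 1
        found, n = linear_scan(b, result)
        comparisons += n
        if found:
            matches.append(b)
    return matches, executions, comparisons


def run_once_then_scan(a_rows, run_subselect):
    """Assumed model 2: one execution, then a sequential scan of the
    stored result for each row of a (the behaviour in the quoted text)."""
    result = run_subselect()          # executed exactly once
    comparisons = 0
    matches = []
    for b in a_rows:
        found, n = linear_scan(b, result)
        comparisons += n
        if found:
            matches.append(b)
    return matches, 1, comparisons


# Hypothetical data standing in for "select b from a" and "select d from c".
a = [1, 2, 3, 4]
c = [3, 4, 5]
print(rerun_per_row(a, lambda: list(c)))       # ([3, 4], 4, 9)
print(run_once_then_scan(a, lambda: list(c)))  # ([3, 4], 1, 9)
```

In this toy model, running the sub-select once removes the repeated executions, but the number of comparisons against the result set is unchanged: it still grows with (rows in a) x (rows in the result).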

Bruce, I apologise if I've completely misunderstood what's going on and
that your e-mail was all about correcting me. I don't have a good grasp
of seq-scan vs. (nested-)joins vs. hash joins vs. mergejoins etc.
(although any pointers on where to get a crash course in these would be
greatly appreciated).

> This is a major performance problem, one that is known, and one that
> should be fixed, but I am sounding like a broken record.

Yeah, again I apologise if this has been discussed to death in the past and
I missed it all (or it went over my head ;) )

> The solution is to allow the subquery results to be mergejoined(sorted),
> or hashjoined with the outer query.

erm...
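
For readers in the same position, the idea in the quoted paragraph can be sketched roughly as follows. This is a simplified Python illustration of hash-based and sort/merge-based membership tests, not PostgreSQL's implementation; the function names and the plain lists standing in for the outer rows and the sub-select result are hypothetical, and NULL semantics are ignored.

```python
def in_via_hash(a_rows, subselect_rows):
    """Hash approach: build a lookup table from the sub-select result once,
    then probe it once per outer row instead of scanning the whole result."""
    lookup = set(subselect_rows)      # build phase: one pass over the result
    return [b for b in a_rows if b in lookup]   # probe phase


def in_via_merge(a_rows, subselect_rows):
    """Merge approach: sort both inputs, then walk them in step.
    Note that the output comes back in sorted order."""
    a_sorted = sorted(a_rows)
    c_sorted = sorted(set(subselect_rows))      # duplicates are irrelevant for IN
    matches = []
    i = j = 0
    while i < len(a_sorted) and j < len(c_sorted):
        if a_sorted[i] < c_sorted[j]:
            i += 1                    # outer value too small, no match
        elif a_sorted[i] > c_sorted[j]:
            j += 1                    # advance the sub-select side
        else:
            matches.append(a_sorted[i])
            i += 1                    # keep j: the next outer row may repeat the value
    return matches


# Hypothetical data standing in for "select b from a" and "select d from c".
a = [4, 1, 3, 3, 2]
c = [5, 3, 4, 3]
print(in_via_hash(a, c))   # [4, 3, 3]
print(in_via_merge(a, c))  # [3, 3, 4]
```

Either way, each outer row is checked without rescanning the whole sub-select result, which is the saving the quoted paragraph is after.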

> Am I correct, or is something else going on here?

most probably correct... :)

regards,
