Re: [SQL] Subselect performance

From: Stuart Rison <rison(at)biochemistry(dot)ucl(dot)ac(dot)uk>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: Stuart Rison <rison(at)biochemistry(dot)ucl(dot)ac(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Lopez <ridruejo(at)atm9(dot)com(dot)dtu(dot)dk>, pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] Subselect performance
Date: 1999-09-21 15:55:23
Message-ID: Pine.LNX.4.10.9909211645170.23146-100000@bsmlx17
Lists: pgsql-sql

On Tue, 21 Sep 1999, Bruce Momjian wrote:
> OK, I am jumping in here, because it seems we have some strange
> behavior.
>
> The only subselect problem I know of is that:
>
> select b from a where b in (select d from c)
>
> will execute the subquery just once, but will do a sequential scan of
> the subquery results for each row of 'a' looking for 'b' that is in
> the set of result rows.

Oh, OK, that's very possible. I was always under the impression (very
possibly misguided) that the reason it took a long time to do an "in
(select...)" was that the sub-select was actually executed for every row
in 'a' so that you ended up doing:

1 x sequential scan of a
a x select on c (one select per row of a)

whereas if you did the sub-select independently and cut-and-pasted the
obtained set into the "in (...)" you were in point of fact just doing:

1 x sequential scan of a (each row tested against loads of OR conditions)

therefore saving the "a x select" time.
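
A rough sketch of the two pictures above, in plain Python rather than anything PostgreSQL actually runs. The function names and the counters are made up for illustration, the tables are modelled as simple lists of values, and NULL handling is ignored:

```python
def in_subquery_per_row(a_rows, c_rows):
    """Assumed model 1: the sub-select is re-executed for every row of 'a'."""
    executions = 0
    matches = []
    for b in a_rows:                 # one sequential scan of 'a'
        sub_result = list(c_rows)    # stands in for re-running "select d from c"
        executions += 1
        if b in sub_result:          # linear search through the result
            matches.append(b)
    return matches, executions


def in_subquery_once(a_rows, c_rows):
    """Assumed model 2 (as described in the quoted text): the sub-select runs
    once, but its result is scanned sequentially for each row of 'a'."""
    sub_result = list(c_rows)        # sub-select executed a single time
    executions = 1
    comparisons = 0
    matches = []
    for b in a_rows:                 # one sequential scan of 'a'
        for d in sub_result:         # sequential scan of the stored result
            comparisons += 1
            if b == d:
                matches.append(b)
                break                # stop at the first match
    return matches, executions, comparisons


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    c = [3, 4, 5]
    print(in_subquery_per_row(a, c))
    print(in_subquery_once(a, c))
```

In both models the number of value comparisons still grows roughly with (rows in a) x (rows in the sub-select result); the second model only saves the repeated execution of the sub-select itself.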

Bruce, I apologise if I've completely misunderstood what's going on and
your e-mail was all about correcting me. I don't have a good grasp
of seq-scan vs. (nested-)joins vs. hash joins vs. mergejoins etc.
(although any pointers on where to get a crash course in these would be
greatly appreciated).

> This is a major performance problem, one that is known, and one that
> should be fixed, but I am sounding like a broken record.

yeah, again I apologise if this has been discussed to death in the past and
I missed it all (or it went over my head ;) )

> The solution is to allow the subquery results to be mergejoined(sorted),
> or hashjoined with the outer query.

erm...

> Am I correct, or is something else going on here?

most probably correct... :)
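
For readers in the same position, the hashed and merge-joined (sorted) alternatives mentioned in the quoted text can be sketched in plain Python. This is only an illustration of the general idea, not PostgreSQL's implementation; the function names are hypothetical, the tables are modelled as lists of values, and NULL handling is ignored:

```python
def in_subquery_hashed(a_rows, c_rows):
    """Hash approach: build a hash table of the sub-select result once,
    then probe it for each row of 'a'."""
    lookup = set(c_rows)                       # built a single time
    return [b for b in a_rows if b in lookup]  # average O(1) probe per row


def in_subquery_merged(a_rows, c_rows):
    """Sort/merge approach: sort both inputs, then walk them in step.
    Note that the output comes back in sorted order."""
    a_sorted = sorted(a_rows)
    c_sorted = sorted(set(c_rows))             # duplicates in c do not matter for IN
    matches = []
    j = 0
    for b in a_sorted:
        # advance through c until reaching a value that is not smaller than b
        while j < len(c_sorted) and c_sorted[j] < b:
            j += 1
        if j < len(c_sorted) and c_sorted[j] == b:
            matches.append(b)
    return matches
```

Either way, each row of 'a' no longer triggers a full pass over the sub-select result: the hashed version does one probe per row, and the merged version does a single combined pass after sorting.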

regards,

S.
