Re: [SQL] Subselect performance

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Stuart Rison <rison(at)biochemistry(dot)ucl(dot)ac(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Lopez <ridruejo(at)atm9(dot)com(dot)dtu(dot)dk>, pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] Subselect performance
Date: 1999-09-21 15:23:10
Message-ID: 199909211523.LAA11575@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> On Tue, 21 Sep 1999, Tom Lane wrote:
>
> > Daniel Lopez <ridruejo(at)atm9(dot)com(dot)dtu(dot)dk> writes:
> > > $list = select d from c
> > > select b from a where b in ( $list )
> > > is 5 seconds
> >
> > > select b from a where b in (select d from c)
> > > is 3 minutes!! (although it should be at least as fast as a)!
> >
> > Not necessarily. Your first example is depending on the fact that
> > the "list" (number of values selected from c) is short. Try it with
> > 10000 or 100000 values from c, if you want to see the backend crash ;-)
>
> I've encoutered this sort of issue myself where I just wanted the
> sub-select to be performed once. Granted it would not work if you wanted
> to select 10000 or 100000 but what if you have a very larged table a and a
> very small table c (using the example above). As you pointed out,
> currently you're looking at 'a x c' runtime... Ugh indeed; whereas just
> executing the subselect once and cut and pasting that you have an order of
> 'a' runtime...

OK, I am jumping in here, because it seems we have some strange
behavour.

The only subselect problem I know of is that:

select b from a where b in (select d from c)

will execute the subquery just once, but will do a sequential scan for
of the subquery results for each row of 'a' looking for 'b' that is in
the set of result rows.

This is a major performance problem, one that is known, and one that
should be fixed, but I am sounding like a broken record.

The solution is to allow the subquery results to be mergejoined(sorted),
or hashjoined with the outer query.

Am I correct, or is something else going on here?

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Bruce Momjian 1999-09-21 15:24:35 Re: [SQL] OID & indexes.
Previous Message Tom Lane 1999-09-21 15:10:30 Re: [SQL] OID & indexes.