Re: sub-query optimization

From: Brad Hilton <bhilton(at)vpop(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: sub-query optimization
Date: 2003-02-14 22:26:18
Message-ID: 1045261577.29960.105.camel@aragorn.vpop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, 2003-02-14 at 14:08, Tom Lane wrote:
> Brad Hilton <bhilton(at)vpop(dot)net> writes:
> > ... If I modify the query slightly:
>
> > --------
> > select 1 from article_categories
> > -->
> > select 1 from articles, article_categories
> > ---------
>
> > the query takes 98 msec.
>
> Yeah, because then the sub-query is a constant (it doesn't depend on the
> current outer row at all) and so it is only evaluated once, not once per
> outer row. Unfortunately, that approach probably gives the wrong
> answers...

Ah, that makes sense. But does it surprise you that when I manipulate
the dataset such that the inner query matches 0 records, the total query
takes so much longer?

Unfortunately, after following the suggestions of several kind posters,
the resulting queries are pretty slow compared to my example which used
'exists.'

The fact that the query takes so long in certain dataset conditions is
surprising me. Watch the following results:

psql> update categories set restrict_views = FALSE;

explain analyze select * from articles where exists
(select 1 from article_categories, categories
where
article_categories.article_id = articles.id and
categories.restrict_views = FALSE and
article_categories.category_id = categories.id
)
and
post_status = 'publish'
order by publish_time desc limit 10;
Total runtime: 0.69 msec

psql> update categories set restrict_views = TRUE;
explain analyze select * from articles where exists
(select 1 from article_categories, categories
where
article_categories.article_id = articles.id and
categories.restrict_views = FALSE and
article_categories.category_id = categories.id
)
and
post_status = 'publish'
order by publish_time desc limit 10;

Total runtime: 27490.84 msec

Is that a surprising result? I would think that the second time things
would be faster because there are no matches to the inner query. In
fact, if I execute the inner query by itself, minus the reference to the
articles table, it executes lightning fast. (0.07 msec)

-Brad

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Brad Hilton 2003-02-14 22:39:31 Re: sub-query optimization
Previous Message Tom Lane 2003-02-14 22:08:27 Re: sub-query optimization