Skip site navigation (1) Skip section navigation (2)

Re: Expensive where clause

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Keith Worthington <keithw(at)narrowpathinc(dot)com>
Cc: PostgreSQL Novice <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Expensive where clause
Date: 2005-02-19 15:11:56
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-novice
On Sat, 19 Feb 2005, Keith Worthington wrote:

> On Fri, 18 Feb 2005 21:55:29 -0800 (PST), Stephan Szabo wrote
> > On Fri, 18 Feb 2005, Keith Worthington wrote:
> >
> > > I have been working on a SQL statement that contains a WHERE
> > > clause of the form WHERE column1 > column2.  The query runs
> > > pretty quickly (285ms) without the WHERE clause but slows to
> > > a relative crawl (5850ms) when it is included.
> > > Any suggestions on how to improve the performance would be
> > > greatly appreciated.
> >
> > Explain analyze output for the query with and without the clause
> > would probably be useful for analysis.
>                          Here is the explain analyze output with the WHERE
> clause commented out.  This one actually ran slow.  Usually it is only a few
> hundred ms without the WHERE clause.

That's probably just the instrumentation.

I'm not 100% sure why it's changing plans although I wonder if the costs
are just close enough that small changes are causing the plan change, but
I think it wouldn't pick a nested loop if it knew that it was grossly
underestimating the number of loops. It might be interesting to see how
the second query runs in explain analyze with enable_nestloop=off although
that'll likely make lower portions of the query more expensive.

One thing that jumps out at me is scans like this:
                           ->  Index Scan using idx_tbl_item_item_type,
idx_tbl_item_item_type on tbl_item items  (cost=0.00..84.16 rows=1
(actual time=0.093..5.702 rows=566 loops=1)
                                  Index Cond: (((item_type)::text =
 'DIR'::text) OR ((item_type)::text = 'NET'::text))
                            Filter: ((NOT inactive) AND (item_class = 1))

This misestimation may be playing a part in why it thinks a nested loop is
a good plan. Is there a strong correlation between some item_types and
item_class or inactive?  What does explain analyze on the following
queries show?

select * from tbl_item where (item_type='DIR OR item_type='NET');
select * from tbl_item where (item_type='DIR OR item_type='NET') AND NOT
select * from tbl_item where (item_type='DIR OR item_type='NET') AND
select * from tbl_item where (item_type='DIR OR item_type='NET') AND NOT
inactive AND item_class=1;

I'm not sure if you'd get any win from a partial index with WHERE NOT
inactive (or possibly both NOT inactive AND item_class=1 if you're almost
always limiting item_class to 1), but that might also be something to

In response to

pgsql-novice by date

Next:From: A GilmoreDate: 2005-02-21 01:32:48
Subject: Concatenate rows
Previous:From: Keith WorthingtonDate: 2005-02-19 06:40:45
Subject: Re: Expensive where clause

Privacy Policy | About PostgreSQL
Copyright © 1996-2018 The PostgreSQL Global Development Group