Skip site navigation (1) Skip section navigation (2)

Re: index scan of whole table, can't see why

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: "Dan Langille" <dan(at)langille(dot)org>
Cc: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: index scan of whole table, can't see why
Date: 2005-01-20 21:38:19
Message-ID: 200501210838.19638.mr-russ@pws.com.au (view raw or flat)
Thread:
Lists: pgsql-performance
On Fri, 21 Jan 2005 02:36 am, Dan Langille wrote:
> On 20 Jan 2005 at 7:26, Stephan Szabo wrote:

[snip]
> > Honestly I expected it to be slower (which it was), but I figured it's
> > worth seeing what alternate plans it'll generate (specifically to see how
> > it cost a nested loop on that join to compare to the fast plan).
> > Unfortunately, it generated a merge join, so I think it might require both
> > enable_hashjoin=false and enable_mergejoin=false to get it which is likely
> > to be even slower in practice but still may be useful to see.
> 
> Setting both to false gives a dramatic performance boost.  See 
> http://rafb.net/paste/results/b70KAi42.html
> 
         ->  Materialize  (cost=15288.70..15316.36 rows=2766 width=35) (actual time=0.004..0.596 rows=135 loops=92)
               ->  Nested Loop  (cost=0.00..15288.70 rows=2766 width=35) (actual time=0.060..9.130 rows=135 loops=1)

The Planner here has a quite inaccurate guess at the number of rows that will match in the join.  An alternative to 
turning off join types is to up the statistics on the Element columns because that's where the join is happening.  Hopefully the planner will
get a better idea.  However it may not be able too.  2766 rows vs 135 is quite likely to choose different plans.  As you can
see you have had to turn off two join types to give something you wanted/expected.

> This gives suitable speed, but why does the plan vary so much with 
> such a minor change in the WHERE clause?
Plan 1 - broken
       ->  Nested Loop  (cost=0.00..3825.30 rows=495 width=35) (actual time=0.056..16.161 rows=218 loops=1)

Plan 2 - deprecated
        ->  Hash Join  (cost=3676.78..10144.06 rows=2767 width=35) (actual time=7.638..1158.128 rows=135 loops=1)

The performance difference is when the where is changed, you have a totally different set of selection options.
The Plan 1 and Plan 2 shown from your paste earlier, report that you are out by a factor of 2 for plan 1.  But for plan 2
its a factor of 20.  The planner is likely to make the wrong choice when the stats are out by that factor.

Beware what is a small "typing" change does not mean they queries are anything alight.

Regards

Russell Smith.

In response to

Responses

pgsql-performance by date

Next:From: Matt CastersDate: 2005-01-20 21:39:02
Subject: Re:
Previous:From: Alex TurnerDate: 2005-01-20 21:35:12
Subject: Re: [SQL] OFFSET impact on Performance???

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group