Re: order of nested loop

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joseph Shraibman <joseph(at)xtenit(dot)com>
Cc: pgsql-general(at)postgreSQL(dot)org
Subject: Re: order of nested loop
Date: 2003-06-17 03:32:05
Message-ID: 8882.1055820725@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Joseph Shraibman <joseph(at)xtenit(dot)com> writes:
> Tom Lane wrote:
>> Joseph Shraibman <jks(at)selectacast(dot)net> writes:
>>> I have two queries that return identical results. One is a SELECT
>>> DISTINCT and the other is the same query without the DISTINCT.
>>
>> Could we see the actual queries? And the complete EXPLAIN ANALYZE
>> output?

> (private email, bec I don't want to sanatize this huge thing for the list)

Okay, but people send bigger things than this to the list all the time...

> [fast case]
> (SELECT distinct
> u.userkey,u.adminper,u.addper,u.approvper,u.ownerper,u.subper,u.banned,u.status,u.config,u.rules, ...
> ORDER BY lower(d.username) LIMIT 25)

> -> Limit (cost=36903.81..36916.31 rows=25 width=604) (actual
> time=10.92..10.93 rows=1 loops=1)
> -> Unique (cost=36903.81..37128.43 rows=449 width=604)
> (actual time=10.92..10.92 rows=1 loops=1)
> -> Sort (cost=36903.81..36915.04 rows=4492
> width=604) (actual time=10.91..10.91 rows=1 loops=1)
> Sort Key: lower(d.username), u.userkey,
> u.adminper, u.addper, u.approvper, u.ownerper, u.subper, u.banned, u.status, u.co
> nfig, u.rules, (subplan), CASE WHEN ((u.rules IS NULL) OR (u.rules = ''::text)) THEN
> ''::text ELSE 'r'::text END, d.username, d.realname, d.firstname, (subplan), d.st
> atus, 2
> -> Nested Loop (cost=0.00..36631.27
> rows=4492 width=604) (actual time=10.65..10.67 rows=1 loops=1)
> -> Index Scan using
> usertable_podkey_key on usertable u (cost=0.00..16003.53 rows=5446 width=555) (actual time=0.
> 04..0.05 rows=1 loops=1)
> Index Cond: (podkey = 20)
> Filter: ((status = 2) AND (NOT
> banned))
> -> Index Scan using directory_pkey on
> directory d (cost=0.00..3.78 rows=1 width=49) (actual time=0.02..0.03 rows=
> 1 loops=1)
> Index Cond: (d.userkey =
> "outer".userkey)
> Filter: ((status = 2) OR (status
> = 5))

> [slow case]
> (SELECT
> u.userkey,u.adminper,u.addper,u.approvper,u.ownerper,u.subper,u.banned,u.status,u.config,u.rules, ...
> ORDER BY lower(d.username) LIMIT 25)

> -> Limit (cost=0.00..30602.38 rows=25 width=604) (actual
> time=74810.10..102624.62 rows=1 loops=1)
> -> Nested Loop (cost=0.00..5499145.64 rows=4492
> width=604) (actual time=74810.10..102624.61 rows=1 loops=1)
> -> Index Scan using directory_lower_username_key
> on directory d (cost=0.00..2380577.42 rows=525568 width=49) (actual time=15.61..79588.09
> rows=589718 loops=1)
> Filter: ((status = 2) OR (status = 5))
> -> Index Scan using usertable_pkey on usertable u
> (cost=0.00..5.92 rows=1 width=555) (actual time=0.04..0.04 rows=0 loops=589718)
> Index Cond: (("outer".userkey = u.userkey)
> AND (u.podkey = 20))
> Filter: ((status = 2) AND (NOT banned))

As near as I can tell, the failure of estimation is not in the slow case
--- the planner correctly estimates that this plan is expensive.
Rather, it's in the fast case, because the planner mistakenly thinks
that that is also expensive. The critical error is right here:

> -> Index Scan using
> usertable_podkey_key on usertable u (cost=0.00..16003.53 rows=5446 width=555) (actual time=0.
> 04..0.05 rows=1 loops=1)
> Index Cond: (podkey = 20)
> Filter: ((status = 2) AND (NOT
> banned))

That scan is estimated to yield 5446 rows and it only yields 1. Do you
have any idea why the estimate is so far off? (My guess is that podkey,
status and banned are correlated to a large extent, but you tell us.)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Greg Stark 2003-06-17 04:34:44 Re: full featured alter table?
Previous Message Ernest E Vogelsinger 2003-06-17 02:54:56 Re: [PERFORM] Interesting incosistent query timing