Re: Searching for the cause of a bad plan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Csaba Nagy <nagy(at)ecircle-ag(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Searching for the cause of a bad plan
Date: 2007-09-21 23:30:34
Message-ID: 27857.1190417434@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> That's not my perspective. If the LIMIT had been applied accurately to
> the cost then the hashjoin would never even have been close to the
> nested join in the first place.

[ shrug... ] Your perspective is mistaken. There is nothing wrong with
the way the LIMIT estimation is being done. The plan in question was

Limit (cost=0.00..498511.80 rows=10 width=1804) (actual time=17.729..21.672 rows=2 loops=1)
-> Nested Loop (cost=0.00..1794642.48 rows=36 width=1804) (actual time=17.729..21.671 rows=2 loops=1)
-> Index Scan using pk_table_a on table_a ta (cost=0.00..324880.88 rows=388638 width=16) (actual time=0.146..0.198 rows=2 loops=1)
Index Cond: (a = $1)
-> Index Scan using pk_table_b2 on table_b2 tb (cost=0.00..3.77 rows=1 width=1788) (actual time=10.729..10.731 rows=1 loops=2)
Index Cond: (ta.b = tb.b)
Total runtime: 21.876 ms

and there are two fairly serious estimation errors here, neither related
at all to the LIMIT:

* five-orders-of-magnitude overestimate of the number of table_a rows
that will match the condition on a;

* enormous underestimate of the number of join rows --- it's apparently
thinking only 0.0001 of the table_a rows will have a join partner,
whereas at least for this case they all do.

Had the latter estimate been right, the cost of pulling results this
way would indeed have been something like 50K units per joined row,
because of the large number of inner index probes per successful join.

It might be interesting to look into why those estimates are so far
off; the stats Csaba displayed don't seem to have any obvious oddity
that would justify such bizarre results. But the LIMIT has got
nothing to do with this.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Denes Daniel 2007-09-22 00:08:43 Re: Query planner unaware of possibly best plan
Previous Message brauagustin-susc 2007-09-21 21:45:27 Re: Low CPU Usage