Re: Strange query optimization in 7.3.2

From: Alec Mitchell <apm13(at)columbia(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Strange query optimization in 7.3.2
Date: 2003-04-14 22:20:29
Message-ID: 200304141520.30037.apm13@columbia.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Monday 14 April 2003 01:25 pm, Tom Lane wrote:
> I think the reason for the change in plan is the same bug discussed at
> http://fts.postgresql.org/db/mw/msg.html?mid=1064055
>
> However, you will probably not like the fix, since it eliminates the
> bogusly small cost estimate for the duplicated index condition, and
> thereby ensures that your less-favored plan will always be chosen :-(
>
> What would be interesting is to look into why the planner's estimated
> costs are inaccurate. I think the main cause is the badly-off join
> estimate for the tr/t join --- notice it's estimating 1119 rows out
> where only 52 are actually produced. The nestloop's runtime is directly
> proportional to the number of outer rows, so this leads directly to a
> factor-of-20 overestimate of the nestloop's cost, discouraging the
> planner from using it. The bug that's triggered by the duplicate
> index condition underestimates the cost, thereby negating that error to
> some extent.
>
> You should look into whether increasing the statistics targets for
> t.terminal and tr.terminal would improve the accuracy of the join
> estimate.

That bug does seem to be the cause of the confusing plan change. I really
don't understand that bad estimate for the join though (the estimate of 1119
is for the tr/m join, rather than the tr/t join).

Here are some details about the tables and joins involved:

The tr/t join produces 52 rows with unique trailers (the primary key on tr)
out of the 750 available (the planner estimates 62). These are then joined
with the manifests table m, which has 13526 rows. The relationship between
tr.trailer and m.trailer is a bit complex. Of the 750 possible trailer
values in tr, 607 have a one to one mapping to rows in m. The remaining 143
values are each referenced in 1-70 (avg 24) different rows in m.
Additionally, there are 9510 rows in m (the vast majority), which have a null
value for trailer (perhaps that is the cause of these bad statistics).

This particular query happens to call only trailers with a one to one
relationship to rows in m (this not unlikely considering that this condition
is true for 607 out of the 750 values). Setting the statistics targets for
m.trailer to 200 or even 750, both of which should be overkill considering
that there are only 750 possible values, and then performing a VACUUM ANALYZE
strangely makes no difference to the row estimates.

I'm having a lot of trouble tracking down the reason for this bad estimate
(all the preceding estimates are quite good). Any further guidance would be
greatly appreciated.

Thanks,
Alec Mitchell

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ed L. 2003-04-14 22:27:10 SPI question re elog() and aborted transactions?
Previous Message Vilson farias 2003-04-14 21:34:33 Help using pgfsck