Re: Optimizer Not using the Right plan

From: "galy lee" <galylee4pg(at)gmail(dot)com>
To: pkalva(at)livedatagroup(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimizer Not using the Right plan
Date: 2007-12-05 12:49:33
Message-ID: 58acf4b70712050449h5be72805la82fcc5905f0a133@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I think you may increase the row number that you want to limit, like LIMIT
50.

LIMIT can change the cost of a plan dramatically. Looking in your SQL:

where this_.fkaddressid= 6664161
order by this_.addressvaluationid desc limit 1;

Planner may use either index1(this_.fkaddressid) or
index2(this_.addressvaluationid) to scan the table. Although it is obvious
that using index2 is very expensive, but because you are trying to limit one
row from 1304 row, so the cost of using index2 changes to

883328/1304=677.69

The cost of using index1 should be lager than 1304, so planner chooses
index2.

Planner tends to choose a plan which has small startup cost when you are
trying to LIMIT a small portion of data over a large data set. It seems that
the following issue also comes from the same root.

http://archives.postgresql.org/pgsql-performance/2007-11/msg00395.php
Best Regards
Galy Lee

> Tom Lane wrote:
>
> Pallav Kalva <pkalva ( at ) livedatagroup ( dot ) com> writes:
>
> why does it have different plans for different values
>
> Because the values occur different numbers of times (or so it thinks
> anyway). If the rowcount estimates are far from reality, perhaps
> increasing the statistics target would help. However, since you
> only showed EXPLAIN and not EXPLAIN ANALYZE output, no one can
> really tell whether the optimizer did anything wrong here.
>
> regards, tom lane
>
> Hi Tom,
>
>
> Thanks! for your reply, here is an another example of the same query with
> different addressid now. This time I got the explain analyze on the query,
> this query also uses the Index Scan Backwards, it says it took 28 seconds
> but I can say that after looking at the postgres logs it took more than 2
> min when the query first ran. I ran this one again to get the explain analyze.
>
>
> The statistics set to "default_statistics_target = 100"
>
>
> I am sure if it uses index on addressid it would be quicker but for some
> reason it using index backward scan on addressvaluationid and that is
> taking too long.
>
> Not only this one there are some other queries which use index scan backwards
> scan and it takes too long. Index scan backwards most of the time is not
> doing good for me is there any way to avoid it ?
>
> explain analyze
>
> select this_.addressvaluationid as addressv1_150_1_, this_.sourcereference
> as sourcere2_150_1_, this_.createdate as createdate150_1_,
> this_.valuationdate as valuatio4_150_1_, this_.valuationamount as
> valuatio5_150_1_, this_.valuationhigh as valuatio6_150_1_,
> this_.valuationlow as valuatio7_150_1_, this_.valuationconfidence as
> valuatio8_150_1_, this_.valuationchange as valuatio9_150_1_, this_.historycharturl
> as history10_150_1_, this_.regionhistorycharturl as regionh11_150_1_, this_.fkaddressid
> as fkaddre12_150_1_, this_.fkaddressvaluationsourceid as fkaddre13_150_1_,
> this_.fkvaluationchangeperiodid as fkvalua14_150_1_, valuationc2_.valuationchangeperiodid
> as valuatio1_197_0_,
>
> valuationc2_.name as name197_0_
>
> from listing.addressvaluation this_ left outer join
> listing.valuationchangeperiod valuationc2_ on this_.fkvaluationchangeperiodid=valuationc2_.valuationchangeperiodid
>
>
> where this_.fkaddressid= 6664161
> order by this_.addressvaluationid desc limit 1;
>
> QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.00..677.69 rows=1 width=494) (actual time=
> 28454.708..28454.712 rows=1 loops=1) -> Nested Loop Left Join (cost=
> 0.00..883705.44 rows=1304 width=494) (actual time=28454.700..28454.700rows=1 loops=1) ->
> Index Scan Backward using pk_addressvaluation_addressvaluationid on
> addressvaluation this_ (cost=0.00..883328.22 rows=1304 width=482) (actual
> time=28441.236..28441.236 rows=1 loops=1)
>
> Filter: (fkaddressid = 6664161)
>
> -> Index Scan using pk_valuationchangeperiod_valuationchangeperiodid on valuationchangeperiod
> valuationc2_ (cost=0.00..0.28 rows=1 width=12) (actual time=13.447..13.447rows=1 loops=1) Index
> Cond: (this_.fkvaluationchangeperiodid = valuationc2_.valuationchangeperiodid)
>
>
> Total runtime: 28454.789 ms
> (7 rows)
>
>
>

Browse pgsql-performance by date

  From Date Subject
Next Message Bill Moran 2007-12-05 18:25:44 Re: Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)
Previous Message Julian Mehnle 2007-12-05 11:26:56 Re: Bad query plans for queries on partitioned table