Skip site navigation (1) Skip section navigation (2)

Re: VACUUM ANALYZE slows down query

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: werner fraga <wfraga(at)yahoo(dot)com>
Cc: John Arbash Meinel <john(at)arbash-meinel(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: VACUUM ANALYZE slows down query
Date: 2005-02-22 23:08:21
Message-ID: 13064.1109113701@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-performance
I wrote:
> Well, with the increased (and much more accurate) rowcount estimate,
> the estimated cost of the nestloop naturally went up a lot: it's
> proportional to the number of rows involved.  It appears that the
> estimated cost of the mergejoin actually went *down* quite a bit
> (else it'd have been selected the first time too).  That seems odd to
> me.

Nah, I just can't count :-(.  What I forgot about was the sub-select in
the output list:

>> select ToolRepairRequest.RequestID, (Select
>> count(ToolHistory.HistoryID) from ToolHistory where
>> ToolRepairRequest.RepairID=ToolHistory.RepairID) as
>> CountOfTH

which shows up in the (un-analyzed) EXPLAIN output here:

   SubPlan
     ->  Aggregate  (cost=524.17..524.17 rows=1 width=4) (actual time=0.032..0.035 rows=1 loops=1518)
           ->  Index Scan using th_repair_key on toolhistory  (cost=0.00..523.82 rows=140 width=4) (actual time=0.013..0.018 rows=1 loops=1518)
                 Index Cond: ($0 = repairid)

Now in this case the planner is estimating 79 rows out, so the estimated
cost of the nestloop plan includes a charge of 79*524.17 for evaluating
the subplan.  If we discount that then the estimated cost of the
nestloop plan is 3974.74..6645.99 (48055.42-79*524.17).

In the ANALYZEd case the subplan is estimated to be a lot cheaper:

   SubPlan
     ->  Aggregate  (cost=6.98..6.98 rows=1 width=4) (actual time=0.038..0.042 rows=1 loops=1518)
           ->  Index Scan using th_repair_key on toolhistory  (cost=0.00..6.97 rows=2 width=4) (actual time=0.016..0.021 rows=1 loops=1518)
                 Index Cond: ($0 = repairid)

It's estimated to be needed 1533 times, but that still adds up to less
of a charge than before.  Discounting that, the mergejoin plan was
estimated at 18310.59..18462.10 (29162.44 - 1533*6.98).  So it's not
true that the estimated cost of the join went down in the ANALYZEd case.

Werner sent me a data dump off-list, and trawling through the planner I
got these numbers for the estimated costs without the output subquery:

without any statistics:
	mergejoin cost	9436.42 .. 9571.81
	nestloop cost	3977.74 .. 6700.71

with statistics:
	mergejoin cost	18213.04 .. 18369.73
	nestloop cost	 4054.93 .. 24042.85

(these are a bit different from his results because of different ANALYZE
samples etc, but close enough)

So the planner isn't going crazy: in each case it chose what seemed the
cheapest total-cost plan.

			regards, tom lane

In response to

pgsql-performance by date

Next:From: Gaetano MendolaDate: 2005-02-23 01:03:10
Subject: Re: is pg_autovacuum so effective ?
Previous:From: Luke ChambersDate: 2005-02-22 22:08:05
Subject: Inefficient Query Plans

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group