Re: What about utility to calculate planner cost constants?

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: "Tambet Matiisen" <t(dot)matiisen(at)aprote(dot)ee>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: What about utility to calculate planner cost constants?
Date: 2005-03-22 04:42:50
Message-ID: 87acowcmut.fsf@stark.xeocode.com
Lists: pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:

> > Otherwise it could just collect statements, run EXPLAIN ANALYZE for all
> > of them and then play with planner cost constants to get the estimated
> > values as close as possible to actual values. Something like Goal Seek
> > in Excel, if you pardon my reference to MS :).
>
> That's not really practical. There are currently 5 major query tuning
> parameters, not counting the memory adjustments which really can't be left
> out. You can't realistically test all combinations of 6 variables.

I don't think it would be very hard at all actually.

It's just a linear algebra problem with a bunch of independent variables and a
system of equations. Solving for all of them simultaneously is straightforward.

Of course in reality these variables aren't actually independent because the
costing model isn't perfect. But that wouldn't be a problem, it would just
reduce the accuracy of the results.
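
To spell that out (my notation, nothing that exists in the code today): each
instrumented query contributes one equation relating its observed runtime to
the unknown cost constants, and with many more queries than constants the
system is overdetermined, so the constants can be fit by least squares:

    t_i  ~=  n_i1 * random_page_cost + n_i2 * sequential_page_cost
           + n_i3 * index_tuple_cost + ...

where t_i is the observed runtime of query i and the n_ij are the counts of
each primitive operation the planner charged for. The fit then picks the
constants that minimize sum_i (observed t_i - predicted t_i)^2.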

What's needed is for the explain plan to total up the cost penalties
independently. The result would be something like

1000 * random_page_cost + 101 * sequential_page_cost + 2000 * index_tuple_cost
+ ...

In other words a tuple like <1000,101,2000,...>

And explain analyze would produce the above tuple along with the resulting
time.
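
Just as an illustration (this isn't real explain analyze output; the record
layout and field names are made up), each instrumented run could then be
logged as something like:

    # Hypothetical record for one run: the planner's coefficient tuple plus
    # the measured runtime. Names and layout are assumptions for illustration.
    observation = {
        "coefficients": {
            "random_page_cost": 1000,
            "sequential_page_cost": 101,
            "index_tuple_cost": 2000,
            # ...
        },
        "actual_time_ms": 1840.0,
    }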

Some program would then have to collect these values from the log or stats
data, assemble them into a large linear system, and solve for the cost
constants that minimize the divergence from the observed times.
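
Here is a minimal sketch of that solving step in Python/numpy, assuming the
coefficient tuples and runtimes have already been collected from the logs
(illustration only, with made-up numbers; no such tool exists today):

    import numpy as np

    # One row per observed query: counts of (random page fetches,
    # sequential page fetches, index tuples, ...) charged by the planner.
    A = np.array([
        [1000.0,  101.0, 2000.0],
        [  50.0, 9000.0,    0.0],
        [ 300.0,  250.0,  800.0],
        # ... ideally many more queries than unknowns
    ])

    # Observed runtimes in ms for the same queries, ideally with the
    # explain analyze instrumentation overhead already subtracted out.
    t = np.array([1840.0, 1120.0, 530.0])

    # Ordinary least squares: the cost constants that minimize the
    # divergence between predicted and observed times.
    constants, _, _, _ = np.linalg.lstsq(A, t, rcond=None)

    # Renormalize so the sequential page cost stays 1, matching the
    # planner's current convention (the second column is the seq page count).
    print(constants / constants[1])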

(Cost penalties are currently normalized so that sequential_page_cost is 1.
That convention could be kept, or the costs could instead be normalized so
that one cost unit corresponds to an expected 1ms.)

(Also, explain analyze currently has overhead that makes this impractical.
Ideally it could subtract out its own overhead so the solutions would be
accurate enough to be useful.)

--
greg
