Re: A costing analysis tool

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: A costing analysis tool
Date: 2005-10-13 05:01:36
Message-ID: 17817.1129179696@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Note that I'm talking about a tool strictly to check the accuracy of
> the estimated costs of plans chosen by the planner, nothing else.

We could definitely do with some infrastructure for testing this.
I concur with Bruce's suggestion that you should comb the archives
for previous discussions --- but if you can work on it, great!

> (2) A large database must be created for these tests, since many
> issues don't show up in small tables. The same data must be generated
> in every database, so results are comparable and reproducible.

Reproducibility is way harder than it might seem at first glance.
What's worse, the obvious techniques for creating reproducible numbers
amount to eliminating variables that are important in the real world.
(One of which is size of database --- some people care about
performance of DBs that fit comfortably in RAM...)

Realistically, the planner is never going to have complete information.
We need to design planning models that generally get the right answer,
but are not so complicated that they (a) are impossible to maintain
or (b) take huge amounts of time to compute. (We're already getting
some flak on the time the planner takes.) So there is plenty of need
for engineering compromise here. Still, you can't engineer without
raw data, so I'm all for creating a tool that lets us gather real-world
cost data.

The only concrete suggestion I have at the moment is to not design the
tool directly around "measure the ratio of real time to cost". That's
only meaningful if the planner's cost model is already basically correct
and you are just in need of correcting the cost multipliers. What we
need for the near term is ways of quantifying cases where the cost
models are just completely out of line with reality.
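
A minimal sketch of how such cases might be quantified, offered only as an
illustration. It assumes each test query yields an estimated plan cost and a
measured runtime (for example, from EXPLAIN ANALYZE output collected by
hand). The function name, the sample format, and the default threshold are
hypothetical. Rather than fitting one time-per-cost-unit multiplier, it
takes the median ratio as "typical" and flags queries whose ratio is off
from it by a large factor in either direction:

```python
import math
from statistics import median


def flag_misestimates(samples, factor=10.0):
    """Return queries whose time/cost ratio is far from the typical ratio.

    samples: iterable of (label, estimated_cost, actual_ms) tuples
             (hypothetical format; gather the numbers however you like).
    factor:  how many times off from the median ratio counts as
             "out of line" (an arbitrary illustrative default).
    """
    # Work with log ratios so 10x too cheap and 10x too expensive
    # are treated symmetrically.
    log_ratios = [
        (label, math.log(actual_ms / cost))
        for label, cost, actual_ms in samples
        if cost > 0 and actual_ms > 0
    ]
    if not log_ratios:
        return []

    typical = median(lr for _, lr in log_ratios)

    flagged = []
    for label, lr in log_ratios:
        deviation = math.exp(abs(lr - typical))  # multiplicative error
        if deviation >= factor:
            flagged.append((label, deviation))

    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    demo = [
        ("seqscan_small", 100.0, 10.0),
        ("indexscan", 200.0, 20.0),
        ("hashjoin", 50.0, 5.0),
        ("nestloop_bad", 100.0, 1000.0),
    ]
    for label, deviation in flag_misestimates(demo):
        print(f"{label}: off by about {deviation:.0f}x from the typical ratio")
```

The median keeps a handful of badly modeled plans from dragging the baseline
toward themselves, which a fitted multiplier would not. The demo numbers are
made up for illustration.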

regards, tom lane
