Re: A costing analysis tool

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: A costing analysis tool
Date: 2005-10-14 18:37:37
Message-ID: 24296.1129315057@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I propose capturing only three values from the output of explain
> analyze, and saving them with many columns of context information.

You really have to capture the rowcounts (est and actual) too.
Otherwise you can't tell if it's a costing problem or a statistics
problem.
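To illustrate the point about capturing both rowcounts: each plan node in EXPLAIN ANALYZE output carries an estimated rowcount (in the cost parenthetical) and an actual one. A sketch of extracting the pair from one plan line, hedged in that real plan lines vary (never-executed nodes, text wrapping, etc.):

```python
import re

# Pull (estimated, actual) row counts out of one EXPLAIN ANALYZE plan
# line.  A sketch only; real plan output has variants this won't match.
PLAN_LINE = re.compile(
    r"\(cost=[\d.]+\.\.[\d.]+ rows=(\d+) width=\d+\) "
    r"\(actual time=[\d.]+\.\.[\d.]+ rows=(\d+) loops=(\d+)\)"
)

def row_counts(line):
    """Return (estimated, actual) rows, or None if the line has no costs."""
    m = PLAN_LINE.search(line)
    if m is None:
        return None
    est = int(m.group(1))
    actual = int(m.group(2)) * int(m.group(3))  # actual rows is per-loop
    return est, actual

line = ("Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4) "
        "(actual time=0.010..1.234 rows=9876 loops=1)")
```

A large est/actual discrepancy points at a statistics problem; with good rowcounts but bad runtime, the costing itself is suspect.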

More generally, I think that depending entirely on EXPLAIN ANALYZE
numbers is a bad idea, because the overhead of EXPLAIN ANALYZE is both
significant and variable depending on the plan structure. The numbers
that I think we must capture are the top-level EXPLAIN cost and the
actual runtime of the query (*without* EXPLAIN). Those are the things
we would like to get to track closely. EXPLAIN ANALYZE is incredibly
valuable as context for such numbers, but it's not the thing we actually
wish to optimize.
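A sketch of capturing those two primary numbers, where `run_query` stands in for whatever driver call the harness uses (a hypothetical name, not part of any proposal here): take the top-level cost from plain EXPLAIN, then time the query itself without any EXPLAIN wrapper.

```python
import time

def measure(run_query, sql):
    """Return (top-level estimated cost, actual wall-clock seconds).

    run_query(sql) is assumed to return the result rows (for EXPLAIN,
    the plan text lines).  Parsing here is a sketch tied to the usual
    first-line shape, e.g.
      "Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4)"
    """
    plan_top = run_query("EXPLAIN " + sql)[0]
    total_cost = float(plan_top.split("..")[1].split(" rows=")[0])

    # Actual runtime, deliberately *without* EXPLAIN ANALYZE, so the
    # measurement is free of its per-node instrumentation overhead.
    start = time.perf_counter()
    run_query(sql)
    elapsed = time.perf_counter() - start
    return total_cost, elapsed
```

The EXPLAIN ANALYZE output can then be stored alongside as context, without contaminating the runtime figure being optimized.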

> Besides the additional context info, I expect to be storing the log
> of the ratio, since it seems to make more sense to average and
> look for outliers based on that than the raw ratio.

Why would you store anything but raw data? Easily-derivable numbers
should be computed while querying the database, not kept in it.
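The log-of-ratio is exactly such an easily-derivable number: store the raw cost and runtime, and take the logarithm when querying. A minimal sketch (column names hypothetical):

```python
import math

# Raw measurements as stored; the derived log-ratio is computed at
# query time, the equivalent of ln(actual_ms / estimated_cost) in SQL.
raw_rows = [
    {"estimated_cost": 155.0, "actual_ms": 120.0},
    {"estimated_cost": 80.0,  "actual_ms": 400.0},
]

def log_ratio(row):
    return math.log(row["actual_ms"] / row["estimated_cost"])

# Outlier screening on the derived value, not on anything persisted.
outliers = [r for r in raw_rows if abs(log_ratio(r)) > 1.0]
```

Keeping only raw data means the averaging or outlier criterion can change later without re-collecting anything.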

regards, tom lane
