From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: More thoughts about planner's cost estimates
Date: 2006-06-02 22:36:40
Message-ID: 11319.1149287800@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> In general it seems to me that for CPU-bound databases, the default values
> of the cpu_xxx_cost variables are too low. ... rather than telling people
> to manipulate all three of these variables individually, I think it might
> also be a good idea to provide a new GUC variable named something like
> "cpu_speed_scale" that would just be a multiplier for the other variables.
> It would default to 1.0 and our standard advice for CPU-bound databases
> would be "decrease random_page_cost to 1.0 and raise cpu_speed_scale to
> 10.0 or so". Seems cleaner than telling people to muck with three or so
> individual values.
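
(For concreteness, the standard advice under that earlier proposal would
have amounted to roughly the following, cpu_speed_scale being the
hypothetical new knob, not anything that exists today:)

    -- sketch of the earlier cpu_speed_scale proposal; cpu_speed_scale
    -- is hypothetical, a single multiplier over the cpu_xxx_cost GUCs
    SET random_page_cost = 1.0;
    SET cpu_speed_scale = 10.0;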

Nicolai Petri's comment about per-tablespace access costs caused me to
rethink the above proposal. Instead of inventing "cpu_speed_scale",
which seems rather baroque after thinking about it more, what I now
think we should do is invent a "seq_page_cost" GUC to replace the
traditionally hardwired value of 1.0 cost unit per sequential page
fetch. Then, if you've got different tablespaces with different disk
speeds, you could imagine having per-tablespace values of seq_page_cost
and random_page_cost, whereas you probably want the CPU cost numbers
to remain the same across all tables.
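
To make that concrete: seq_page_cost would become an ordinary GUC, and
if we ever did grow per-tablespace settings they might look roughly like
this (the ALTER TABLESPACE syntax below is pure invention, just to
illustrate the shape of the thing):

    -- seq_page_cost replaces the hardwired 1.0 per sequential page fetch
    SET seq_page_cost = 1.0;
    SET random_page_cost = 4.0;

    -- hypothetical per-tablespace overrides for a faster disk array
    ALTER TABLESPACE fast_disks
        SET (seq_page_cost = 0.5, random_page_cost = 1.5);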

I don't really want to get into inventing per-tablespace settings right
now, because the need hasn't been demonstrated; but if we ever do want
to do it, this approach will be a whole lot less confusing than
something involving a cpu_speed_scale knob.

This still leaves you twiddling two knobs (now random_page_cost and
seq_page_cost) if you want to set up the planner for an all-in-memory
database; but that's the same number of knobs as before, so it's no
more complicated for that purpose.
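
For that case the settings would look something like this (the exact
values are only illustrative):

    -- all-in-memory database: a "page fetch" is just a memory copy,
    -- so both page costs shrink and random access loses its penalty
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;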

One objection to this is that after moving "off the gold standard" of
1.0 = one page fetch, there is no longer any clear meaning to the
cost estimate units; you're faced with the fact that they're just an
arbitrary scale. I'm not sure that's such a bad thing, though. For
instance, some people might want to try to tune their settings so that
the estimates are actually comparable to milliseconds of real time.
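
As an example of that sort of calibration: if you measured a sequential
page fetch at about 1 ms and a random fetch at about 4 ms on a given
machine (numbers invented purely for illustration), you might set

    -- calibrate so that 1.0 cost unit ~ 1 millisecond on this hardware
    SET seq_page_cost = 1.0;      -- ~1 ms per sequential page fetch
    SET random_page_cost = 4.0;   -- ~4 ms per random page fetch
    SET cpu_tuple_cost = 0.01;    -- ~0.01 ms to process one tuple

and then EXPLAIN's cost estimates could be read directly against the
actual runtimes that EXPLAIN ANALYZE reports.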

Comments?

regards, tom lane
