Re: [HACKERS] Solution for LIMIT cost estimation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Solution for LIMIT cost estimation
Date: 2000-02-11 04:31:36
Message-ID: 18282.950243496@sss.pgh.pa.us
Lists: pgsql-hackers

"Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> It seems the current cost estimation couldn't be converted into a*N+b
> form exactly. For example, the cost of a seq scan is
>     count of pages + count of tuples * cpu_weight + ...
> The count of pages couldn't be converted into a*N form.
> The cost of an index scan is more complicated.

It would not be an exact conversion, because the cost of a query is
clearly *not* a perfectly linear function of the number of tuples
retrieved before stopping. Another example, besides the ones you
mention, is that a nested-loop join will suffer a "hiccup" in
output rate each time it restarts the inner scan, if the inner scan
is of a kind that has nontrivial startup cost. But I believe that
this effect is not very significant compared to all the other
approximations the optimizer already has to make.
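
[To make the "hiccup" concrete, here is a minimal sketch in C; the
names are hypothetical and this is not planner source. It assumes
every inner tuple joins, so the inner scan restarts once per outer
row. The ceil() term jumps by the inner startup cost at each restart,
so the cumulative cost is a staircase rather than a straight line.]

#include <math.h>

/*
 * Hedged sketch, not planner code: cumulative cost of a nested-loop
 * join after producing its first k output tuples.  Assumes
 * inner_rows > 0 tuples per inner scan and that every inner tuple
 * joins.  Each inner restart pays inner_startup again, which is the
 * "hiccup" in the output rate described above.
 */
double
nestloop_cost_first_k(double inner_rows, double inner_startup,
                      double per_tuple_cost, double k)
{
    /* Inner scans begun so far: one per outer row consumed. */
    double inner_scans_started = ceil(k / inner_rows);

    return inner_scans_started * inner_startup + k * per_tuple_cost;
}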

Basically, my proposal is that the plan cost estimation routines
provide a "startup cost" (cost expended to retrieve the first
tuple) in addition to the "total cost" they already estimate.
Then, upper-level planning routines that want to estimate the cost
of fetching just part of the query result will estimate that cost
by linear interpolation between the startup cost and the total
cost. Sure, it's just a rough approximation, but it'll be a long
time before that's the worst error made by the planner ;-)
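
[As a minimal sketch of that interpolation, again in C with
hypothetical names rather than actual planner code, assuming each
plan node reports a startup cost, a total cost, and an estimated
row count:]

/*
 * Hedged sketch, not planner code: estimate the cost of fetching
 * only tuples_wanted of an estimated tuples_total result rows by
 * interpolating linearly between the startup cost (first tuple)
 * and the total cost (all tuples).
 */
double
partial_cost(double startup_cost, double total_cost,
             double tuples_wanted, double tuples_total)
{
    double fraction;

    /* Fetching everything (or an unknown total) costs the full amount. */
    if (tuples_total <= 0 || tuples_wanted >= tuples_total)
        return total_cost;

    fraction = tuples_wanted / tuples_total;
    return startup_cost + fraction * (total_cost - startup_cost);
}

[For example, LIMIT 10 on a plan estimated at 1000 rows with startup
cost 5 and total cost 105 would be costed at 5 + 0.01 * 100 = 6.]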

regards, tom lane
