Solution for LIMIT cost estimation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Solution for LIMIT cost estimation
Date: 2000-02-10 16:48:15
Message-ID: 15706.950201295@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We have discussed in the past the need for the optimizer to take LIMIT
into account when choosing plans. Currently, since planning is done
on the basis of total plan cost for retrieving all tuples, there's
no way to prefer a "fast start" plan over "slow start". But that's
what you want if there's a small LIMIT.

Up to now I haven't seen a practical way to fix this, because the
optimizer does its planning bottom-up (start with scan plans, then
make joins, etc) and there's no easy way to know at the bottom of
the pyramid whether you can expect to take advantage of a LIMIT that
exists at the top. For example, if there's a SORT or GROUP step
in between, you can't apply the LIMIT to the bottom level; but the
bottom guys don't know whether there will be such a step.

I have thought of a fairly clean way to attack this problem, which
is to represent the cost of a plan in two parts instead of only one.
Instead of just "cost", have "startup cost" and "cost per tuple".
(Actually, it'd probably be easier to work with "startup cost" and
"total cost if all tuples are retrieved", but that's a trivial
representational detail; the point is that our cost model will now be
of the form a*N+b when N tuples are retrieved.) It'd be pretty easy
to produce plausible numbers for all the plan types we use. Then,
the plan comparators would keep any plan that wins on either startup
or total cost, rather than only considering total cost. Finally
at the top level of planning, when there is a LIMIT the preferred
plan would be selected by comparing a*LIMIT+b rather than total cost.

I think I can get this done before beta, but before I go into hack-
attack mode I wanted to run it up the flagpole and see if anyone
has any complaints or better ideas.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2000-02-10 16:55:00 Re: [HACKERS] psql and libpq fixes
Previous Message Ross J. Reedstrom 2000-02-10 16:34:50 Re: [INTERFACES] The persistance of C functions