Re: Fuzzy cost comparison to eliminate redundant planning

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fuzzy cost comparison to eliminate redundant planning
Date: 2004-03-29 15:00:48
Message-ID: 200403291500.i2TF0mm20947@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Right. There are potentially some ranges of LIMIT for which it could
> >> win, I believe.
>
> > What if we take the total cost and divide it by the number of rows returned ---
> > then we have a per-row cost for each plan. Then we subtract the two, and
> > that difference compared to the difference in startup costs tell us how
> > many rows could potentially use this plan.
>
> Here, plan B loses everywhere: to A at lower percentages and to C at
> higher ones. But I don't see how to eliminate B on the basis of
> one-at-a-time comparisons. It seems like getting rid of B would require
> turning add_path into an O(N^3) instead of O(N^2) algorithm ... which
> would almost certainly eat more cycles than it'd save.

Nice charts. :-)

I agree we don't want anything that is O(high), but I was thinking of
something that would be more agressive than 1%, which works well for
lots of self joins, but I am not sure how well for other cases.
Consider these plans that return 10 rows:

total startup total per row retrieved
plan1 1 1 .1
plan2 5 0.5 .5
plan3 10 0 1

Now, the difference between plan1 and plan2 total is 500%, yet it is a
useless plan. If you want to retrieve one row, you pick plan3, if you
want 2 or more rows, you pick plan1.

If the per-row total cost plus the startup cost is less than another's,
we can throw it away. In fact, when we go check for cheapest startup
cost to keep, do we at least assume we have one row fetched?

What if instead of doing total cost 1% difference, we compute
total-per-row + startup being 1% different? Does that catch more
similar cost plans? Seems it would.

In your example, how many rows were returned? I can see how this would
have handled that case.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Bowlby 2004-03-29 15:07:44 Row sampling..
Previous Message Teodor Sigaev 2004-03-29 14:45:49 Re: GIST code doesn't build on strict 64-bit machines