Re: Improving N-Distinct estimation by ANALYZE

From: Rod Taylor <pg(at)rbt(dot)ca>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: tshipley(at)deru(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving N-Distinct estimation by ANALYZE
Date: 2006-01-05 13:59:41
Message-ID: 1136469581.6629.83.camel@home
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > Do you *really* want the median estimate in these case? Are you certain you
> > do not want something with the opposite behavior of Chaudhuri's estimate so
> > that for small sample sizes the bias is toward a high estimate of D?
> > (Converges on D from the right instead of the left.)
> >
> > Chaudhuri's <-----D------------------> needed
> > Estimate estimate
>
> Hmmm. Yeah, I see what you mean. True, the ideal approach would to
> deterime for each query operation whether a too-low D or a too-high D
> was more risky, and then use the more conservative number. However,
> that would complicate the query planner enough that I think Tom would
> leave us. :-p

You could have some specific functions vote themselves out if their cost
is shakey. We know that the cost of a miscalculated nestloop is huge, so
after calculating the common case it might apply a multiplier for the
"risk" involved.

There have been lots of requests for a way to achieve more consistent
plans that have a determined worst case performance, even if they never
perform as well in the best case as another algorithm might. Perhaps
this could be a GUC.

PlanCost + PlanCost * Risk * RiskGUC

"Risk" is a number that indicates how badly things can go wrong.

"RiskGUC" is an integer multiplier. Someone who is risk averse (wants a
predictable execution time rather than the best possible time) would set
this value high. Others who want the best possible plan in most cases
even if it performs poorly once in a while will set the value very low,
possibly 0.

--

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2006-01-05 14:04:06 Re: [PATCHES] TRUNCATE, VACUUM, ANALYZE privileges
Previous Message Marko Kreen 2006-01-05 10:31:28 Re: Heads up: upcoming back-branch re-releases