From: Greg Stark <stark(at)mit(dot)edu>
To: decibel <decibel(at)decibel(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-pass planner
Date: 2013-04-04 01:40:50
Message-ID: CAM-w4HMmxpmM0BfaS53YwU37q4wPFTp0zohuEwgD0vGven8=tQ@mail.gmail.com
Lists: pgsql-hackers
On Fri, Aug 21, 2009 at 6:54 PM, decibel <decibel(at)decibel(dot)org> wrote:
> Would it? Risk seems like it would just be something along the lines of
> the high-end of our estimate. I don't think confidence should be that hard
> either. IE: hard-coded guesses have a low confidence. Something pulled
> right out of most_common_vals has a high confidence. Something estimated
> via a bucket is in-between, and perhaps adjusted by the number of tuples.
>
I used to advocate a similar idea. But when questioned on list I tried to
work out the details and ran into some problems coming up with a concrete
plan.
How do you compare a plan that you think has a 99% chance of running in 1ms
but a 1% chance of taking 1s against a plan that has a 90% chance of taking
1ms and a 10% chance of taking 100ms? Which one is actually riskier? They
might even both have the same 95th-percentile run-time.
And there are also different types of unknowns. Do you want to treat plans
where we have a statistical sample that gives us a probabilistic answer the
same as plans where we think our model just has a 10% chance of being wrong?
The model is going to be either consistently right or consistently wrong for
a given query, but the sample will vary from run to run. (Or vice versa,
depending on the situation.)
--
greg