Re: Avoiding bad prepared-statement plans.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeroen Vermeulen <jtv(at)xs4all(dot)nl>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bart Samwel <bart(at)samwel(dot)tk>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoiding bad prepared-statement plans.
Date: 2010-02-26 03:19:21
Message-ID: 603c8f071002251919m6f43747bi336680806c40adc4@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 25, 2010 at 9:48 PM, Jeroen Vermeulen <jtv(at)xs4all(dot)nl> wrote:
> Robert Haas wrote:
>>
>> On Wed, Feb 17, 2010 at 5:52 PM, Jeroen Vermeulen <jtv(at)xs4all(dot)nl> wrote:
>>>
>>> I may have cut this out of my original email for brevity... my
>>> impression is that the planner's estimate is likely to err on the
>>> side of scalability, not best-case response time; and that this is
>>> more likely to happen than an optimistic plan going bad at runtime.
>>
>> Interestingly, most of the mistakes that I have seen are in the
>> opposite direction.
>
> I guess there's not much we can do about those, except decide, after
> running, that it's worth optimizing for specific values.
>
>
>>> Yeb points out a devil in the details though: the cost estimate is
>>> unitless.  We'd have to have some orders-of-magnitude notion of how
>>> the estimates fit into the picture of real performance.
>>
>> I'm not sure to what extent you can assume that the cost is
>> proportional to the execution time.  I seem to remember someone
>> (Peter?) arguing that they're not related by any fixed ratio, partly
>> because things like page costs vs. cpu costs didn't match physical
>> reality, and that in fact some attempts to empirically gather
>> better values for things like random_page_cost and seq_page_cost
>> actually ended up making the plans worse rather than better.  It would
>> be nice to see some research in this area...
>
> Getting representative workloads and machine configurations may make that
> hard.  :/
>
> But all we really want is a check for really obscene costs, as an extra
> stopgap so we don't have to wait for the thing to execute before we decide
> it's too costly.  Surely there must be some line we can draw.
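
For what it's worth, the unit mismatch is easy to observe from SQL:
EXPLAIN reports cost in abstract planner units (multiples of
seq_page_cost, 1.0 by default), while EXPLAIN ANALYZE reports
milliseconds, and the ratio between the two varies with hardware,
caching, and the configured page costs. A minimal sketch, using a
hypothetical table:

    -- Illustrative only; big_t is a made-up table.
    CREATE TABLE big_t AS
        SELECT g AS id, md5(g::text) AS payload
        FROM generate_series(1, 1000000) g;
    ANALYZE big_t;

    -- Estimated cost, in unitless planner units:
    EXPLAIN SELECT * FROM big_t WHERE id < 1000;

    -- Actual time, in milliseconds; no fixed conversion factor
    -- links this to the estimate above:
    EXPLAIN ANALYZE SELECT * FROM big_t WHERE id < 1000;

    -- The knobs whose empirical tuning is mentioned above:
    SHOW seq_page_cost;
    SHOW random_page_cost;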

I actually think there isn't any clean line. Obscene is in the eye of
the beholder. Frankly, I think this discussion is getting off into
the weeds. It would be nice, perhaps, to have a feature that will
detect when the generic plan is the suxxor and attempt to find a
better one, but that's really, really hard for a whole bunch of
reasons. Bruce's suggestion that we should provide some user control
over whether we plan at bind time or execute time seems likely to be
(1) considerably simpler to implement, (2) considerably easier to get
consensus on, and (3) capable of giving 90% of the benefit for an only
slightly higher inconvenience factor.
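
For reference, a knob along these lines did eventually ship: PostgreSQL
12 added the plan_cache_mode setting, which lets users force either
custom plans (built with the actual parameter values) or generic plans
(built once, value-independent) for prepared statements. A minimal
sketch, reusing the hypothetical big_t table from the example above:

    PREPARE q(int) AS SELECT count(*) FROM big_t WHERE id < $1;

    SET plan_cache_mode = force_custom_plan;   -- replan for each value
    EXPLAIN EXECUTE q(1000);

    SET plan_cache_mode = force_generic_plan;  -- one cached, value-blind plan
    EXPLAIN EXECUTE q(1000);

    RESET plan_cache_mode;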

...Robert
