Re: Avoiding bad prepared-statement plans.

From: Richard Huxton <dev(at)archonet(dot)com>
To: Jeroen Vermeulen <jtv(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoiding bad prepared-statement plans.
Date: 2010-02-09 14:45:57
Message-ID: 4B717525.2010907@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/02/10 14:25, Jeroen Vermeulen wrote:
> Richard Huxton wrote:
>
>>> = Actual-cost threshold =
>>>
>>> Also stop using the generic plan if the statement takes a long time to
>>> run in practice.
>>
>> Do you mean:
>> 1. Rollback the current query and start again
>> 2. Mark the plan as a bad one and plan again next execute
>>
>> If you can figure out how to do #1 then you could probably do it for
>> all queries, but I'm guessing it's far from simple to implement.
>
> I'm talking about #2. As a matter of fact #1 did come up in one of those
> discussions, but how do you know you're not killing the query juuust
> before it'd done, and then maybe executing a different plan that's no
> better?

Ah, you'd need to be smarter when planning and also remember the
expected rows from each node. That way if your (index driven) inner node
was expecting 3 rows you could mark it to force a cancellation if it
returns (say) 30 or more. You'd allow more slack in later processing and
less slack earlier on where a bad estimate can explode the final number
of rows.

Or, there is always the case where we reverse-search an index to find
the last 10 messages in a group say, but the particular group in
question hasn't had a comment for months, so you trawl half the table.
People regularly get bitten by that, and there's not much to be done
about it. If we could abort when it looks like we're in worst-case
rather than best-case scenarios then it would be one less thing for
users to worry about.

>>> = Plan refresh =
>>>
>>> Periodically re-plan prepared statements on EXECUTE. This is also a
>>> chance for queries that were being re-planned every time to go back to a
>>> generic plan.
>>
>> Presumably some score based on update stats and vacuum activity etc.
>
> I was thinking of something very simple: re-do whatever we'd do if the
> statement were only being prepared at that point.

Yes, I thought so, the scoring was for *when* to decide to cancel the
old plan. I suppose total query-time would be another way to decide this
plan needs reworking.

>> The good side of all these ideas is good indeed. The bad side is plan
>> instability. Someone somewhere will have a generic plan that turns out
>> better than the specific plan (due to bad stats or config settings or
>> just planner limitations). The question is (I guess): How many more
>> winners will there be than losers?
>
> That's a good and surprising point, and therefore I'd like to draw
> attention away to a different point. :-)
>
> Yes, there will be losers in the sense that people may have optimized
> their use of prepared statements to whatever the current planner does.
> Maybe somebody out there even deliberately uses them to trick the
> planner into a different plan. But that is always going to happen; we're
> aiming for better plans, not for giving more detailed control over them.
> If you really can't take a change, don't upgrade.
>
> The competing point is: people out there may currently be forgoing
> prepared statements entirely because of erratic performance. To those
> people, if we can help them, it's like having a new feature.

Oh, I'm persuaded, but that doesn't really get you anywhere :-)

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Mielke 2010-02-09 14:53:17 Re: Avoiding bad prepared-statement plans.
Previous Message Korry Douglas 2010-02-09 14:37:35 Re: ERROR: could not load library "...": Exec format error