Re: Avoiding bad prepared-statement plans.

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeroen Vermeulen <jtv(at)xs4all(dot)nl>, Alex Hunsaker <badalex(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Bart Samwel <bart(at)samwel(dot)tk>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoiding bad prepared-statement plans.
Date: 2010-02-26 18:53:38
Message-ID: 4B8818B2.8030703@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02/26/2010 11:27 AM, Tom Lane wrote:
> Also, I think there is a lot of confusion here over two different
> issues: generic plan versus parameter-specific plan, and bad planner
> estimates leading to a wrong plan choice. While the latter is certainly
> an issue sometimes, there is no reason to believe that it affects
> prepared statements worse than non-prepared ones. So I think that
> designing a fix for prepared statements on the assumption that you can't
> trust the planner's estimates is solving the wrong problem.
>

Just to point out that I agree, and as per my original post, I think the
only time prepared statements should be re-planned for the statistics
case, is after 'analyze' has run. That sounds like a quicker solution,
and a much smaller gain. After 'analyze' of an object, invalidate all
cached plans for prepared statements that rely on that object and
require a re-plan. I doubt this will help me or many others very often.
It's something that should be done some day, but I don't recall ever
concluding that a performance problem I was experiencing was related to
using prepared statements too long. Also, the client is able to figure
this out. The client can choose to free prepared statements after 1
minute or 1000 calls. It's not really a problem.

It also has nothing to do with trust of the planner's estimates. Given
the same criteria, the planner should come up with the same best plan
most or all of the time. Trial and error planning, with the exception of
hugely complicated plans that cannot be produced in a finite time frame,
does not appeal to me at all. I do trust the planner's estimates.

The issue of specific parameter is the one I think most of us would
benefit from, and I think the most effective benefit is to not create
generic plans. I would prefer a prepare with specific plan and re-plan
when the specific plan does not apply, over generic plan, every time.
This has nothing to do with "time to prepare" or a ratio of "time to
prepare" vs "time to execute", or plans that are expected to take some
time to execute. The fact that I can run a PREPARE/EXECUTE, and SELECT,
and with only one invocation see a difference of over 100X shows that
generic plans is just not the right approach. It works according to
spec, but it is not practical under the current model.

Generic plans is the problem. My post was to bring attention to this, as
I see most comments focusing on an assumption that generic plans provide
value, and specific plans should only be used when generic plans are
expected to take a while to execute. It's walking around the problem
that the idea of a generic plan is just wrong. The only time a generic
plan is right, is when the specific plan would result in the same.

Cheers,
mark

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gokulakannan Somasundaram 2010-02-26 18:54:17 Re: A thought on Index Organized Tables
Previous Message Tom Lane 2010-02-26 18:53:16 Re: Re: Hot Standby query cancellation and Streaming Replication integration