Re: Avoiding bad prepared-statement plans.

From: Jeroen Vermeulen <jtv(at)xs4all(dot)nl>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alex Hunsaker <badalex(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Bart Samwel <bart(at)samwel(dot)tk>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoiding bad prepared-statement plans.
Date: 2010-02-26 10:20:17
Message-ID: 4B87A061.40206@xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mark Mielke wrote:

> Re-planning a generic plan with another generic plan may generate zero
> benefit, with a measurable cost. More on this after...

Nobody's talking about doing that any more. I proposed it initially
because I didn't know about changes that made it unnecessary.

> All the points about ms seem invalid to me. There are many reason why ms
> could increase, and many of them have nothing to do with plan
> efficiency. Again, re-planning due to a high ms, or a high ratio of ms,
> does not indicate that re-planning will improve the success of the plan.
> The planning process does not measure ms or predict ms.

That's true, but missing some very basic points about the idea: one, if
we can tell that a query is going to be expensive, then the cost of
re-planning it is marginal. Two, if we can tell that a query is going
to be expensive, then we stand a lot to gain if re-planning turns out to
be useful. It follows that we can afford to re-plan on the off-chance,
without anything more than a vague orders-of-magnitude idea of what
"expensive" means.

What Tom said validates a big assumption I've been making: that we do in
fact have a decent shot at telling in advance that a query is going to
be expensive. Which means we have a decent shot at stopping your 100ms
query from taking seconds just because you prepared it and are missing
out on that tiny partial index. That would be worth the extra planning
time at a 1% hit rate, and there's not much downside if we don't reach that.

> My idea of an optimal system is as follows:
>
> 1) Prepare gathers and caches data about the tables involved in the
> query, including column statistics that are likely to be required during
> the planning process, but prepare does not running the planning process.

It sounds to me like you're in the process of inventing another planning
process. Developer time aside, how much CPU time can you afford to
throw at this?

I don't see any reason to argue over what would be optimal when so much
information is still missing. It just makes the problem look harder
than it is. To me, our best shot at getting something useful is to stay
simple and defensive. After that, if there is still a need, we'll have
code to help us gather more data and figure out how to make it better.
Nothing wrong with the lowest-hanging fruit.

Jeroen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Boszormenyi Zoltan 2010-02-26 11:12:10 Re: NaN/Inf fix for ECPG
Previous Message Bernd Helmle 2010-02-26 09:41:26 Re: pg_stop_backup does not complete