Re: Prepared statements fail after schema changes with surprising error

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Prepared statements fail after schema changes with surprising error
Date: 2013-01-22 17:44:19
Message-ID: 22921.1358876659@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jan 22, 2013 at 2:17 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I think what's happening is that this bit:
>>> DROP SCHEMA public;
>>> CREATE SCHEMA public;
>> changes the OID of schema public, whereas the search_path that's cached
>> for the cached plan is cached in terms of OIDs. So while there is a
>> table named public.z1 at the end of the sequence, it's not in any schema
>> found in the cached search path.
>>
>> We could possibly fix that by making the path be cached as textual names
>> not OIDs, but then people would complain (rightly, I think) that
>> renaming a schema caused unexpected behavior.

> What sort of unexpected behavior?

After reflecting on this a bit, I think that the problem may come from
drawing an unjustified analogy between views and prepared statements.
The code is certainly trying to treat them as the same thing, but
perhaps we shouldn't do that.

Consider that once you do
create view v as select * from s.t;
the view will continue to refer to the same table object no matter what.
You can rename t, you can rename s, you can move t to a different schema
and then drop s, but the view still knows what t is, because the
reference is by OID. The one thing you can't do is drop t, because the
stored dependency from v to t will prevent that (at least unless you let
it cascade to drop v as well). Views therefore do not have, or need,
any explicit dependencies on either specific schemas or their
creation-time search_path --- they only have dependencies on individual
objects.

The current plancache code is trying, in a somewhat half-baked fashion,
to preserve those semantics for prepared queries --- that's partly
because it's reusing the dependency mechanism that was designed for
views, and partly because it didn't occur to us to question that model.
But it now strikes me that the model doesn't apply very well, so maybe
we need a new one. The key point that seems to force a different
treatment is that there are no stored (globally-visible) dependencies
for prepared queries, so there's no way to guarantee that referenced
objects don't get dropped.

We could possibly set things up so that re-executing a prepared query
that references now-dropped objects would throw an error; but what
people seem to prefer is that it should be re-analyzed to see if the
original source text would now refer to a different object. And we're
doing that, but we haven't followed through on the logical implications.
The implication, ISTM, is that we should no longer consider that
referring to the same objects throughout the query's lifespan is a goal
at all. Rather, what we should be trying to do is make the query
preparation transparent, in the sense that you should get the same
results as if you resubmitted the original query text each time.

In particular, it now seems to me that this makes a good argument
for changing what plancache is doing with search_path. Instead of
re-establishing the original search_path in a rather vain hope that the
same objects will be re-selected by parse analysis, we should consider
that the prepared query has a dependency on the active search path, and
thus force a replan if the effective search path changes.

I'm not sure that we can make the plan caching 100% transparent, though.
The existing mechanisms will force replan if any object used in the plan
is modified (and fortunately, "modified" includes "renamed", even though
a rename isn't interesting according to the view-centric worldview).
And we can force replan if the search path changes (ie, the effective
list of schema OIDs changes). But there are cases where neither of
those things happens and yet the user might expect a new object to be
selected. Consider for example that the search path is a, b, c,
and we have a prepared query "select * from t", and that currently
refers to b.t. If now someone creates a.t, or renames a.x to a.t,
then a replan would cause the query to select from a.t ... but there
was no invalidation event that will impinge on the stored plan, and the
search_path setting didn't change either. I don't think we want to
accept the overhead of saying "any DDL anywhere invalidates all cached
plans", so I don't see any good way to make this case transparent.
How much do we care?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-01-22 18:22:01 Re: Prepared statements fail after schema changes with surprising error
Previous Message Gavin Flower 2013-01-22 17:09:02 Re: CF3+4 (was Re: Parallel query execution)