Re: why do we need two snapshots per query?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why do we need two snapshots per query?
Date: 2011-11-13 17:32:08
Message-ID: CA+Tgmob+ui+Jo-tWH_Vs=JQ4u4pj4r49kcDV=LkOkQre2bkz6Q@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 13, 2011 at 11:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> If we could be certain that a query was being executed immediately
>
> ... that is, with the same snapshot ...
>
>> then it would be possible to simplify expressions using stable
>> functions as if they were constants. My earlier patch did exactly
>> that.
>
> Mph.  I had forgotten about that aspect of it.  I think that it's
> very largely superseded by Marti Raudsepp's pending patch:
> https://commitfest.postgresql.org/action/patch_view?id=649
> which does more and doesn't require any assumption that plan and
> execution snapshots are the same.
>
> Now you're going to say that that doesn't help for failure to prove
> partial index or constraint conditions involving stable functions,
> and my answer is going to be that that isn't an interesting use-case.
> Partial index conditions *must* be immutable, and constraint conditions
> *should* be.  As far as partitioning goes, the correct solution there
> is to move the partition selection to run-time, so we should not be
> contorting query semantics to make incremental performance improvements
> with the existing partitioning infrastructure.
>
> I remain of the opinion that Robert's proposal is a bad idea.

Wait a minute. I can understand why you think it's a bad idea to
preserve a snapshot across multiple protocol messages
(parse/bind/execute), but why or how would it be a bad idea to keep
the same snapshot between planning and execution when the whole thing
is being done as a unit? You haven't offered any real justification
for that position, and it seems to me that if anything the semantics
of using two snapshots are far *less* intuitive than doing the whole
thing under a single snapshot. The whole point of snapshot isolation
is that our view of the database doesn't change mid-query, and yet you
are now saying that a view which changes between planning and
execution is exactly the behavior we should have. That seems exactly
backwards to me.
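
To make the difference concrete, here is a toy sketch in Python. It is
not PostgreSQL code: ToyDatabase, take_snapshot, run_query and bump are
hypothetical names invented for illustration, and a "snapshot" is
modelled as nothing more than an index into a list of committed
versions. It only shows that a commit landing between planning and
execution is visible to the executor when two snapshots are taken, and
invisible when one snapshot is reused.

```python
# Toy model only: none of these names correspond to PostgreSQL internals.
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Append-only list of committed versions of a single key/value table."""
    versions: list = field(default_factory=lambda: [{"threshold": 10}])

    def take_snapshot(self) -> int:
        # A snapshot is just the index of the latest committed version.
        return len(self.versions) - 1

    def commit(self, **changes) -> None:
        # A commit publishes a new version; older snapshots are unaffected.
        self.versions.append({**self.versions[-1], **changes})

    def read(self, snapshot: int, key: str):
        return self.versions[snapshot][key]


def run_query(db, single_snapshot, concurrent_commit=None):
    """Return (value seen while planning, value seen while executing)."""
    plan_snapshot = db.take_snapshot()
    seen_by_planner = db.read(plan_snapshot, "threshold")

    # Another session may commit between planning and execution.
    if concurrent_commit is not None:
        concurrent_commit(db)

    exec_snapshot = plan_snapshot if single_snapshot else db.take_snapshot()
    seen_by_executor = db.read(exec_snapshot, "threshold")
    return seen_by_planner, seen_by_executor


def bump(db):
    # Hypothetical concurrent session changing the value mid-query.
    db.commit(threshold=99)


two = run_query(ToyDatabase(), single_snapshot=False, concurrent_commit=bump)
one = run_query(ToyDatabase(), single_snapshot=True, concurrent_commit=bump)
print(two)  # (10, 99): planner and executor disagree
print(one)  # (10, 10): one consistent view
```

With two snapshots the planner and the executor can see different
values within what the user thinks of as one query; with a single
snapshot they cannot.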

I also think you are dismissing Simon's stable-expression-folding
proposal far too lightly. I am not sure that the behavior he wants is
safe given the current details of our implementation - or even with my
patch; I suspect a little more than that is needed - but I am pretty
certain it's the behavior that users want and expect, and we should be
moving toward it, not away from it. I have seen a significant number
of cases over the years where the query optimizer generated a bad plan
because it did less constant-folding than the user expected. Users do
not walk around thinking about the fact that the planner and executor
are separate modules and therefore probably should use separate
snapshots. They expect their query to see a consistent view of the
database. Period.
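
As a rough illustration of why folding a stable function at plan time
depends on the snapshot question, here is another toy Python sketch.
Again, this is not PostgreSQL code: stable_limit, plan and execute are
hypothetical names, a snapshot is modelled as a plain dict, and the
"plan" is just a Python closure. Under those assumptions, the folded
and unfolded plans agree when execution uses the planning snapshot and
can disagree when it uses a later one.

```python
# Hypothetical toy planner; not PostgreSQL code.

def stable_limit(snapshot):
    """A 'stable' function: its result depends only on the snapshot it runs under."""
    return snapshot["limit"]


def plan(plan_snapshot, fold_stable):
    """Build a row predicate, optionally folding stable_limit() to a constant."""
    if fold_stable:
        constant = stable_limit(plan_snapshot)  # evaluated once, at plan time
        return lambda row, exec_snapshot: row < constant
    # Unfolded: the function is evaluated under the executor's snapshot.
    return lambda row, exec_snapshot: row < stable_limit(exec_snapshot)


def execute(predicate, rows, exec_snapshot):
    return [r for r in rows if predicate(r, exec_snapshot)]


rows = [1, 5, 9]
old = {"limit": 6}  # state visible to the planner
new = {"limit": 2}  # state after a concurrent commit

folded = plan(old, fold_stable=True)
unfolded = plan(old, fold_stable=False)

# Same snapshot for planning and execution: folding is invisible.
same_snapshot_agrees = execute(folded, rows, old) == execute(unfolded, rows, old)

# Different execution snapshot: the folded constant is stale.
different_snapshot_agrees = execute(folded, rows, new) == execute(unfolded, rows, new)

print(same_snapshot_agrees)       # True
print(different_snapshot_agrees)  # False
```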

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
