Re: why do we need two snapshots per query?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why do we need two snapshots per query?
Date: 2011-11-13 23:13:05
Message-ID: CA+TgmoYfg1DkGqnLAmmgtvhRoGXp11gX_WN_-VTkdwtmJ+0qXQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 13, 2011 at 12:57 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Wait a minute.  I can understand why you think it's a bad idea to
>> preserve a snapshot across multiple protocol messages
>> (parse/bind/execute), but why or how would it be a bad idea to keep
>> the same snapshot between planning and execution when the whole thing
>> is being done as a unit?  You haven't offered any real justification
>> for that position,
>
> It's not hard to come by: execution should proceed with the latest
> available view of the database.

The word "latest" doesn't seem very illuminating to me. If you take
that to its (illogical) conclusion, that would mean that we ought to
do everything under SnapshotNow - i.e. every time we fetch a tuple,
use the "latest" available view of the database. It seems to me that
you can wrap some logic around this - we shouldn't use a snapshot
taken later than <event1> because <reason1>, and we shouldn't use one
taken earlier than <event2> because <reason2>.

It seems to me that the *latest* snapshot we could use would be one
taken the instant before we did any calculation whose result might
depend on our choice of snapshot. For example, if the query involves
calculating pi out to 5000 decimal places (without looking at any
tables) and then scanning for the matching value in some table column,
we could do the whole calculation prior to taking a snapshot and then
take the snapshot only when we start groveling through the table.
That view would be "later" than the one we use now, but still
correct.
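
(To make that concrete -- compute_pi() below is an invented immutable
function, not anything that exists:

    -- compute_pi(n): hypothetical immutable function returning pi
    -- to n decimal places as text; it reads no tables.
    SELECT *
      FROM some_table
     WHERE some_column = compute_pi(5000);

The compute_pi() call could run to completion before any snapshot
exists, since its result can't depend on one; only the scan of
some_table actually needs the snapshot.)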

On the other hand, it seems to me that the *earliest* snapshot we can
use is one taken the instant after we receive the protocol message
that tells us to execute the query. If we take it any sooner than
that, we might fail to treat as committed a transaction whose commit
was acknowledged before the user sent the message.

Between those two extremes, it seems to me that when exactly the
snapshot gets taken is an implementation detail.
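
(The invariant we can't break looks like this -- t is just a
stand-in table name:

    -- Session A:
    BEGIN;
    INSERT INTO t VALUES (1);
    COMMIT;                    -- A's client gets the acknowledgement here

    -- Session B, sent only after A's COMMIT has returned:
    SELECT count(*) FROM t;   -- must count A's row

If B's snapshot predated the arrival of B's protocol message, it could
miss a commit that was already acknowledged to A.)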

>> and it seems to me that if anything the semantics
>> of such a thing are far *less* intuitive than it would be to do the
>> whole thing under a single snapshot.
>
> In that case you must be of the opinion that extended query protocol
> is a bad idea and we should get rid of it, and the same for prepared
> plans of all types.  What you're basically proposing is that simple
> query mode will act differently from other ways of submitting a query,
> and I don't think that's a good idea.

I don't see why anything I said would indicate that we shouldn't have
prepared plans. It is useful for users to have the option to parse
and plan before execution - especially if they want to execute the
same query repeatedly - and if they choose to make use of that
functionality, then we and they will have to deal with the fact that
things can change between plan time and execution time. If that means
we miss some optimization opportunities, so be it. But we needn't
deliver the semantics associated with the extended query protocol when
the user isn't using it; and the next time we bump the protocol
version we probably should give some thought to making sure that you
only need to use the extended query protocol when you explicitly want
to separate parse/plan from execution, and not just to get at some
other functionality that we've chosen to provide only via the
extended protocol.
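
(At the SQL level that split is already something the user asks for
explicitly -- accounts/aid here are just pgbench-style placeholders:

    PREPARE q(int) AS
        SELECT abalance FROM accounts WHERE aid = $1;  -- parsed and planned now
    EXECUTE q(42);   -- executed later, under a snapshot taken at EXECUTE time

Whoever issues PREPARE has opted into the gap between planning and
execution; someone sending a one-shot query through the simple
protocol has not.)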

> It might be sane if planning
> could be assumed to take zero time, but that's hardly true.

I still maintain that the length of planning is irrelevant; what's
more, if
the planning and execution are happening in response to a single
protocol message, then the semantics of the query need not (and
perhaps even should not) depend on how much of that time is spent
planning and how much is spent executing.

>> I also think you are dismissing Simon's stable-expression-folding
>> proposal far too lightly.  I am not sure that the behavior he wants is
>> safe given the current details of our implementation - or even with my
>> patch; I suspect a little more than that is needed - but I am pretty
>> certain it's the behavior that users want and expect, and we should be
>> moving toward it, not away from it.  I have seen a significant number
>> of cases over the years where the query optimizer generated a bad plan
>> because it did less constant-folding than the user expected.
>
> This is just FUD, unless you can point to specific examples where
> Marti's patch won't fix it.  If that patch crashes and burns for
> some reason, then we should revisit this idea; but if it succeeds
> it will cover more cases than plan-time constant folding could.

I haven't reviewed the two patches in enough detail to have a clear
understanding of which use cases each one does and does not cover.
But, for example, you wrote this:

tgl> As far as partitioning goes, the correct solution there
tgl> is to move the partition selection to run-time, so we should not be
tgl> contorting query semantics to make incremental performance improvements
tgl> with the existing partitioning infrastructure.

...and I don't think I buy it. Certainly, being able to exclude
partitions at runtime would be *extremely* valuable, but it's not a
complete replacement for folding stable functions to constants prior to
planning, because the latter approach allows the planner to see the
function result and estimate the selectivity of that value
specifically, which may lead to a much more accurate estimate and a
completely different and far better plan. Logical purity is not, for
me, a sufficient reason to throw that type of optimization out the
window.
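
(A concrete case, on a hypothetical table partitioned by timestamp:

    SELECT *
      FROM measurements
     WHERE ts >= now() - interval '1 day';

Since now() is stable rather than immutable, constraint exclusion
can't prove at plan time that the older partitions are irrelevant;
folding the expression to a constant before planning would let the
planner both exclude them and, per the argument above, estimate
selectivity against the actual cutoff value.)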

That having been said, I'm pretty interested in what Marti is doing, too.

> One of the reasons I don't want to go this direction is that it would
> re-introduce causes of extended query protocol having poor performance
> relative to simple protocol.  That's not something that users find
> intuitive or desirable, either.

Insisting that we refuse to optimize the simple query protocol is the
wrong solution to that problem.

For what it's worth, the best result I was able to get with the
patches I posted was about a 4% improvement on pgbench throughput
(with 24-32 concurrent clients on a 32-core machine). So we're not
talking about massively handicapping the extended query protocol. At
the same time, we could sweat a lot more blood in other areas of the
system for a lot less benefit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
