Re: Repeatable read and serializable transactions see data committed after tx start

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Repeatable read and serializable transactions see data committed after tx start
Date: 2014-11-06 14:46:04
Message-ID: 1415285164.79092.YahooMailNeo@web122305.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Sabino Mullane <greg(at)turnstep(dot)com> wrote:
> Kevin Grittner wrote:

> (wording change suggestion)
>>> | sees a snapshot as of the start of the first query within the
>>> | transaction, not as of the start of the current query within the
>>> | transaction.
>>
>> Would that have prevented the confusion here?
>
> I think it may have, but I also think the wording should be much
> stronger and clearer, as this is unintuitive behavior. Consider
> this snippet from Bruce's excellent MVCC Unmasked presentation:
>
> "A snapshot is recorded at the start of each SQL statement in
> READ COMMITTED transaction isolation mode, and at transaction start
> in SERIALIZABLE transaction isolation mode."
>
> This is both correct and incorrect, depending on whether you consider
> a transaction to start with BEGIN; or with the first statement
> after the BEGIN. :) I think most people have always assumed that
> BEGIN starts the transaction and that is the point at which the snapshot
> is obtained.

But there is so much evidence to the contrary. Not only does the
*name* of the command (BEGIN or START) imply a start, but
pg_stat_activity shows the connection "idle in transaction" after
the command (and before a snapshot is acquired), the Explicit
Locking section of the docs asserts that "Once acquired, a lock is
normally held till end of transaction", and the docs for the SET
command assert that "The effects of SET LOCAL last only till the
end of the current transaction, whether committed or not." The end
of *which* transaction? The one that started with BEGIN or START
and which might not (or in some cases *must* not) yet have a
snapshot.

>>> But what about creating a flag to BEGIN and SET TRANSACTION
>>> commands, called "IMMEDIATE FREEZE" (or something similar), which
>>> applies only to REPEATABLE READ and SERIALIZABLE? If this flag is set
>>> (and may be off by default, but of course the default may be
>>> configurable via a guc parameter), freeze happens when it is present
>>> (BEGIN or SET TRANSACTION) time. This would be a backwards-compatible
>>> change, while would provide the option of freezing without the nasty
>>> hack of having to do a "SELECT 1" prior to your real queries, and
>>> everything will of course be well documented.
>>
>> What is the use case where you are having a problem? This seems
>> like an odd solution, so it would be helpful to know what problem
>> it is attempting to solve.
>
> Seems like a decent solution to me. The problem it that having to execute
> a dummy SQL statement to start a serializable transaction, rather
> than simply a BEGIN, is ugly.and error prone. Perhaps their app
> assumes (or even requires) that BEGIN starts the snapshot.

Why? This "fix" might not deal with the bigger issues that I
discussed, like that the later-to-start and
later-to-acquire-a-snapshot transaction might logically be first in
the apparent order of execution. You can't "fix" that without a
lot of blocking -- that most of us don't want. Depending on *why*
they think this is important, they might need to be acquiring
various locks to prevent behavior they don't want, in which case
having acquired a snapshot at BEGIN would be exactly the *wrong*
thing to do. The exact nature of the problem we're trying to solve
here does matter.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2014-11-06 14:50:58 Re: REINDEX CONCURRENTLY 2.0
Previous Message Fujii Masao 2014-11-06 14:38:13 Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: HEAD seems to generate larger WAL regarding GIN index