Quick Links

Re: Repeatable read and serializable transactions see data committed after tx start

From:	Álvaro Hernández Tortosa <aht(at)8Kdata(dot)com>
To:	Kevin Grittner <kgrittn(at)ymail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Repeatable read and serializable transactions see data committed after tx start
Date:	2014-11-08 20:48:28
Message-ID:	545E819C.9040300@8Kdata.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 06/11/14 15:00, Kevin Grittner wrote:
> Álvaro Hernández Tortosa <aht(at)8Kdata(dot)com> wrote:
>
>> There has been two comments which seem to state that changing this
>> may introduce some performance problems and some limitations when you
>> need to take out some locks. I still believe, however, that current
>> behavior is confusing for the user. Sure, one option is to patch the
>> documentation, as I was suggesting.
> Yeah, I thought that's what we were talking about, and in that
> regard I agree that the docs could be more clear. I'm not quite
> sure what to say where to fix that, but I can see how someone could
> be confused and have the expectation that once they have run BEGIN
> TRANSACTION ISOLATION LEVEL SERIALIZABLE the transaction will not
> see the work of transactions committing after that. The fact that
> this is possible is implied, if one reads carefully and thinks
> about it, by the statement right near the start of the "Transaction
> Isolation" section which says "any concurrent execution of a set of
> Serializable transactions is guaranteed to produce the same effect
> as running them one at a time in some order." As Robert pointed
> out, this is not necessarily the commit order or the transaction
> start order.
>
> It is entirely possible that if you have serializable transactions
> T1 and T2, where T1 executes BEGIN first (and even runs a query
> before T2 executes BEGIN) and T1 commits first, that T2 will
> "appear" to have run first because it will look at a set of data
> which T1 modifies and not see the changes. If T1 were to *also*
> look at a set of data which T2 modifies, then one of the
> transactions would be rolled back with a serialization failure, to
> prevent a cycle in the apparent order of execution; so the
> requirements of the standard (and of most software which is
> attempting to handle race conditions) is satisfied. For many
> popular benchmarks (and I suspect most common workloads) this
> provides the necessary protections with better performance than is
> possible using blocking to provide the required guarantees.[1]

Yes, you're right in that the "any concurrent execution..." phrase
implicitly means that snapshot may not be taken at BEGIN or SET
TRANSACTION time, but it's definitely not clear enough for the average
user. Yet this may apply to the serializable case, but it doesn't to the
repeatable read where the docs read " The Repeatable Read isolation
level only sees data committed before the transaction began; it never
sees either uncommitted data or changes committed during transaction
execution by concurrent transactions". The first part is confusing, as
we discussed; the second part is even more confusing as it says "during
transaction execution", and isn't the transaction -not the snapshot-
beginning at BEGIN time?

Surprisingly, the language is way more clear in the SET TRANSACTION
doc page [2].

>
> At any rate, the language in that section is a little fuzzy on the
> concept of the "start of the transaction." Perhaps it would be
> enough to change language like:
>
> | sees a snapshot as of the start of the transaction, not as of the
> | start of the current query within the transaction.
>
> to:
>
> | sees a snapshot as of the start of the first query within the
> | transaction, not as of the start of the current query within the
> | transaction.
>
> Would that have prevented the confusion here?

I think that definitely helps. But it may be better to make it even
more clear, more explicit. And offering a solution for the user who may
like the snapshot to be taken "at begin time", like suggesting to do a
"SELECT 1" query.
>
>> But what about creating a flag to BEGIN and SET TRANSACTION
>> commands, called "IMMEDIATE FREEZE" (or something similar), which
>> applies only to REPEATABLE READ and SERIALIZABLE? If this flag is set
>> (and may be off by default, but of course the default may be
>> configurable via a guc parameter), freeze happens when it is present
>> (BEGIN or SET TRANSACTION) time. This would be a backwards-compatible
>> change, while would provide the option of freezing without the nasty
>> hack of having to do a "SELECT 1" prior to your real queries, and
>> everything will of course be well documented.
> What is the use case where you are having a problem? This seems
> like an odd solution, so it would be helpful to know what problem
> it is attempting to solve.

I don't have a particular use case. I just came across the issue
and thought the documentation and behavior wasn't consistent. So the
first aim is not to have users surprised (in a bad way). But I see a
clear use case: users who might want to open a (repeatable read |
serializable) transaction to have their view of the database frozen, to
perform any later operation on that frozen view. Sure, that comes at a
penalty, but I see that potentially interesting too.

Regards,

Álvaro

[1] http://www.postgresql.org/docs/9.4/static/transaction-iso.html
[2] http://www.postgresql.org/docs/9.4/static/sql-set-transaction.html

--
Álvaro Hernández Tortosa

-----------
8Kdata

In response to

Re: Repeatable read and serializable transactions see data committed after tx start at 2014-11-06 14:00:03 from Kevin Grittner

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Álvaro Hernández Tortosa	2014-11-08 20:53:18	Re: Repeatable read and serializable transactions see data committed after tx start
Previous Message	Andreas Joseph Krogh	2014-11-08 20:16:31	Support for detailed description of errors cased by trigger-violations