Re: Serializable Snapshot Isolation

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <gsstark(at)mit(dot)edu>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<heikki(dot)linnakangas(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Serializable Snapshot Isolation
Date: 2010-09-25 22:28:05
Message-ID: 4C9E31250200002500035DC6@gw.wicourts.gov
Lists: pgsql-hackers

Greg Stark wrote:

> So T1 must have happened before TN because it wrote something based
> on data as it was before TN modified it. But T0 can see TN but not
> T1 so there's no complete ordering between the three transactions
> that makes them all make sense.

Correct.

> The thing is that the database state is reasonable, the database
> state is after it would be if the ordering were T1,TN with T0
> happening any time. And the backup state is reasonable, it's as if
> it occurred after TN and before T1. They just don't agree.

I agree that the database state eventually "settles" into a valid
long-term condition in this particular example. The point you are
conceding seems to be that the image captured by pg_dump is not
consistent with that. If so, I agree. You don't see that as a
problem; I do. I'm not sure where we go from there. Certainly a dump
with that inconsistency is better than making pg_dump vulnerable to
serialization failure -- if we don't implement the SERIALIZABLE READ
ONLY DEFERRABLE transactions I was describing, we can change pg_dump
to use REPEATABLE READ and we will be no worse off than we are now.

The new feature I was proposing was that we create a SERIALIZABLE
READ ONLY DEFERRABLE transaction style which would, rather than
acquiring predicate locks and watching for conflicts, potentially
wait until it could acquire a snapshot which was guaranteed to be
conflict-free. In the example discussed on this thread, if we
changed pg_dump to use such a mode, when it went to acquire a
snapshot it would see that it overlapped T1, which was not READ ONLY,
which in turn overlapped TN, which had written to a table and
committed. It would then block until completion of the T1
transaction and adjust its snapshot to make that transaction visible.
You would now have a backup entirely consistent with the long-term
state of the database, with no risk of serialization failure and no
bloating of the predicate lock structures.
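
The waiting rule described above can be illustrated with a toy
simulation. This is only a sketch under stated assumptions, not
PostgreSQL code: the Txn class, the blockers() and
deferred_snapshot_time() helpers, and the integer timestamps are all
hypothetical, and commit times are assumed to be known in advance so
that the wait can be computed rather than actually blocked on.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Txn:
    """Hypothetical transaction record for a toy, fully known history."""
    name: str
    start: int
    end: int               # commit time (assumed known in advance)
    read_only: bool = False
    wrote: bool = False


def blockers(at: int, history: List[Txn]) -> List[Txn]:
    """Read-write transactions still running at `at` that overlapped
    a writer which had already committed by `at`."""
    running = [t for t in history
               if not t.read_only and t.start <= at < t.end]
    committed_writers = [t for t in history if t.wrote and t.end <= at]
    return [r for r in running
            if any(w.end > r.start for w in committed_writers)]


def deferred_snapshot_time(requested: int, history: List[Txn]) -> int:
    """Earliest time >= `requested` at which no blocker is running."""
    at = requested
    while True:
        waiting_on = blockers(at, history)
        if not waiting_on:
            return at
        # "Block" until every blocker has completed, then re-check.
        at = max(t.end for t in waiting_on)


# The example from this thread: T1 starts first and is still running
# when TN writes and commits; the dump asks for a snapshot after that.
history = [
    Txn("T1", start=1, end=6, wrote=True),
    Txn("TN", start=2, end=3, wrote=True),
]
print(deferred_snapshot_time(4, history))  # 6
```

In this toy history a snapshot requested at time 4 is deferred to
time 6, the point at which T1 has committed, so the dump sees both TN
and T1.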

The only downside is that there could be blocking when such a
transaction acquires its snapshot. That seems a reasonable price to
pay for backup integrity. Obviously, if we had such a mode, it would
be trivial to add a switch to the pg_dump command line which would
let the user choose between guaranteed dump integrity and guaranteed
lack of blocking at the start of the dump.

-Kevin
