Re: SERIALIZABLE on standby servers

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SERIALIZABLE on standby servers
Date: 2016-11-15 20:26:03
Lists: pgsql-hackers

On Tue, Nov 8, 2016 at 5:56 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Here is an experimental WIP patch to allow SERIALIZABLE READ ONLY
> DEFERRABLE transactions on standby servers without serialisation
> anomalies, based loosely on an old email from Kevin Grittner[1]. I'm
> not sure how far this is from what he had in mind or whether I've
> misunderstood something fundamental here, but I hope this can at least
> serve as a starting point and we can try to get something into
> Postgres 10.

While out walking I realised what was wrong with that. It's going to
take me a while to find the time to get back to this, so I figured I
should share this realisation in case anyone else is interested in the

The problem is that it determines snapshot safety in
PreCommit_CheckForSerializationFailure, and then races other backends
to XactLogCommitRecord. It could determine that a hypothetical
snapshot taken after this commit is safe, but then other activity
resulting in a hypothetical snapshot of unknown safety could happen
and be logged before we record our determination in the log.
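To make the race concrete, here is a toy model of it in Python. It is
only a sketch: the Primary class, its determine/log_commit methods and
the integer tokens are hypothetical stand-ins for the two steps named
above, not code from the patch.

```python
from itertools import count

SAFE, UNKNOWN = "SNAPSHOT_SAFE", "SNAPSHOT_SAFETY_UNKNOWN"


class Primary:
    """Toy primary: safety is determined in one step and the commit
    record is appended to the log in a separate, later step."""

    def __init__(self):
        self._next_token = count(1)
        self.wal = []  # (token, transaction name, safety) in log order

    def determine(self, name, safety):
        # Stand-in for the check in PreCommit_CheckForSerializationFailure.
        return (next(self._next_token), name, safety)

    def log_commit(self, record):
        # Stand-in for XactLogCommitRecord.
        self.wal.append(record)


def tokens_in_order(wal):
    """True if the log order matches the determination order."""
    tokens = [token for token, _, _ in wal]
    return tokens == sorted(tokens)


def racy_schedule():
    p = Primary()
    a = p.determine("A", SAFE)     # A decides first: snapshot is safe
    b = p.determine("B", UNKNOWN)  # B decides second: safety unknown
    p.log_commit(b)                # ...but B wins the race to the log
    p.log_commit(a)
    return p.wal


def serialized_schedule():
    p = Primary()
    for name, safety in (("A", SAFE), ("B", UNKNOWN)):
        # Determination and logging happen as one step, as if both
        # were done while holding a single lock.
        p.log_commit(p.determine(name, safety))
    return p.wal
```

In racy_schedule() the last record in the log claims SNAPSHOT_SAFE even
though the most recent determination was SNAPSHOT_SAFETY_UNKNOWN, which
is the stale claim a standby would wrongly trust. serialized_schedule()
keeps the two orders identical.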

One solution could be to serialise XactLogCommitRecord for SSI
transactions using SerializableXactHashLock, and determine
hypothetical snapshot safety at the same time, so that commit replay
order matches safety determination order. But it would suck to add
another point of lock contention to SSI commits. Another solution
could be to have recovery on the standby detect tokens (CSNs
incremented by PreCommit_CheckForSerializationFailure) arriving out of
order, but I don't know what exactly it should do about that when it
is detected: you shouldn't respect an out-of-order claim of safety,
but then what should you wait for? Perhaps if the last replayed
commit record before that was marked SNAPSHOT_SAFE then it's OK to
leave it that way, and if it was marked SNAPSHOT_SAFETY_UNKNOWN then
you have to wait for that one to be resolved by a follow-up snapshot
safety message and then rinse and repeat (take a new snapshot, etc.). I
think that might work, but it seems strange to allow random races on
the primary to create extra delays on the standby. Perhaps there is
some much simpler way to do all this that I'm missing.
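The second idea, detecting out-of-order tokens during recovery, could
look roughly like the following. Again this is a sketch under my own
assumptions: the replay function, the action names and the (token,
safety) record shape are invented for illustration, and the "wait" case
is only reported, not implemented.

```python
SAFE, UNKNOWN = "SNAPSHOT_SAFE", "SNAPSHOT_SAFETY_UNKNOWN"

# Hypothetical actions the standby could take for each commit record.
ACCEPT = "accept claim"
KEEP_SAFE = "ignore claim, stay safe"
WAIT = "ignore claim, wait for follow-up safety message"


def replay(records):
    """Replay (token, safety) commit records in log order.

    Returns one (token, action) pair per record.  A record whose token
    is lower than one already replayed arrived out of order, so its
    safety claim is not respected.
    """
    highest_token = 0
    current_state = None  # safety carried by the last in-order commit
    decisions = []
    for token, safety in records:
        if token > highest_token:
            # In order: this record's claim becomes the current state.
            highest_token = token
            current_state = safety
            decisions.append((token, ACCEPT))
        elif current_state == SAFE:
            # Out of order, but the last in-order commit was safe:
            # leave it that way.
            decisions.append((token, KEEP_SAFE))
        else:
            # Out of order and the last in-order commit was of unknown
            # safety: that one must be resolved before taking a new
            # snapshot and trying again.
            decisions.append((token, WAIT))
    return decisions
```

For example, replay([(2, UNKNOWN), (1, SAFE)]) accepts token 2 and then
reports WAIT for token 1, so a race on the primary turns into a delay
on the standby.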

Another detail is that standbys that start up from a checkpoint and
don't see any SSI transactions commit don't yet have any snapshot
safety information, but defaulting to assuming that this point is safe
doesn't seem right, so I suspect it needs to be in checkpoints.

Attached is a tidied-up version which doesn't try to address the above
problems yet. When time permits I'll come back to this.

Thomas Munro

Attachment             Content-Type              Size
ssi-standby-v2.patch   application/octet-stream  35.7 KB
