SERIALIZABLE on standby servers

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: SERIALIZABLE on standby servers
Date: 2016-11-08 04:56:52
Message-ID: CAEepm=2b9TV+vJ4UeSBixDrW7VUiTjxPwWq8K3QwFSWx0pTXHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi hackers

Here is an experimental WIP patch to allow SERIALIZABLE READ ONLY
DEFERRABLE transactions on standby servers without serialisation
anomalies, based loosely on an old email from Kevin Grittner[1]. I'm
not sure how far this is from what he had in mind or whether I've
misunderstood something fundamental here, but I hope this can at least
serve as a starting point and we can try to get something into
Postgres 10.

The patch works by teaching the standby how to do somethings similar
to what SERIALIZABLE READ ONLY DEFERRABLE does on the primary server,
with some help from the primary server in the form of extra
information in the WAL.

The basic idea is: the standby should wait until a point in the
transaction stream where it can take a snapshot and either (1) there
were no concurrent read/write SERIALIZABLE transactions running on the
primary, or (2) the last concurrent read/write SERIALIZABLE
transaction at snapshot time has now ended without creating a
dangerous cycle with our transaction.

In case (1), the primary sometimes adds an extra
xl_xact_snapshot_safety struct to commit messages which says 'a
snapshot taken after this commit and before any future SSI commits is
safe, because there are no concurrent read/write SSI transactions at
this moment'.

In case (2), the xl_xact_snapshot_safety struct embedded in a commit
record instead says 'a snapshot taken after this commit and before any
future SSI commits is of unknown safety, because there are concurrent
transactions; I'll tell you when it has been determined; please
remember this token'. The token (which happens to be a CSN but that
is not important) will appear in a future independent snapshot safety
message which says whether a snapshot taken at that time is safe or
unsafe.

Note that xl_xact_snapshot_safety is embedded in the commit messages
(for SSI transactions only), but also appears as its own WAL record to
report the final state of a token from an earlier commit. So if you
do a lot of non-overlapping writable SSI transactions, you'll get just
a few extra bytes in each commit record, but overlapping transactions
will generate a stream of extra snapshot safety messages, one for each
commit involved.

In order to generate follow-up snapshot safety messages, the patch
creates 'hypothetical' transactions on the primary whenever a
writeable SSI transaction commits, so that it can figure out whether
such a transaction would conflict. These phantom transactions are
proxies for any transaction that may be created on a standby at the
same point in the transaction stream (with respect to SSI commits) on
any standby, and survive in memory just until they are found to be
safe or unsafe.

Example of use:

T1 on primary: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
T1 on primary: INSERT INTO foo VALUES ('x');
T2 on primary: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
T2 on primary: INSERT INTO foo VALUES ('x');
T2 on primary: COMMIT;
T3 on standby: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ
ONLY DEFERRABLE;
T3 on standby: SELECT * FROM foo;
T3 on standby: <...waits...>
T1 on primary: COMMIT;
T3 on standby: <...continues...>

Not tested much and certainly has bugs and many details to sort out,
but first... is this sound or could it be made so? Is there a better
way?

[1] https://www.postgresql.org/message-id/4D3735E30200002500039869%40gw.wicourts.gov

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
ssi-standby-v1.patch application/octet-stream 27.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2016-11-08 05:01:35 Re: Exclude pg_largeobject form pg_dump
Previous Message Amit Kapila 2016-11-08 04:55:11 Re: Re: [COMMITTERS] pgsql: Change the way that LWLocks for extensions are allocated.