SSI and Hot Standby

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: "Dan Ports" <drkp(at)csail(dot)mit(dot)edu>
Subject: SSI and Hot Standby
Date: 2011-01-20 01:05:07
Message-ID: 4D3735E30200002500039869@gw.wicourts.gov
Lists: pgsql-hackers

Here's an issue for feedback from the community -- do we want to
support truly serializable transactions on hot standby machines?

The best way Dan and I have been able to think of to do this is to
build on the SERIALIZABLE READ ONLY DEFERRABLE behavior. We are
able to obtain a snapshot and then check to see if it is at a place
in the transaction processing that it would be guaranteed to be
serializable without participating in predicate locking, rw-conflict
detection, etc. If it's not, we block until a READ WRITE
transaction completes, and then check again. Repeat. We may reach
a point where we determine that the snapshot can't work, and we get
a new one and start over. Due to the somewhat complex rules for
this, you are likely to see a safe snapshot fairly quickly even in a
mix which always has short-lived READ WRITE transactions running,
although a single long-running READ WRITE transaction can block
things until it completes.
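
For illustration only, the retry loop described above might be sketched as follows. This is not PostgreSQL code: `Verdict`, `acquire_safe_snapshot`, and the three callbacks are hypothetical names, and the "somewhat complex rules" are hidden behind the assumed `check` callback rather than implemented.

```python
from enum import Enum
from typing import Callable, TypeVar

S = TypeVar("S")


class Verdict(Enum):
    """Hypothetical outcome of testing one snapshot."""
    SAFE = "safe"        # serializable without predicate locks or rw-conflict tracking
    UNSAFE = "unsafe"    # this snapshot can no longer work; take a new one
    PENDING = "pending"  # depends on READ WRITE transactions that are still running


def acquire_safe_snapshot(
    take_snapshot: Callable[[], S],
    check: Callable[[S], Verdict],
    wait_for_rw_completion: Callable[[], None],
) -> S:
    """Loop until a snapshot is known to be safe, as the text describes."""
    while True:
        snapshot = take_snapshot()
        verdict = check(snapshot)
        # Block until a READ WRITE transaction completes, then check again.
        while verdict is Verdict.PENDING:
            wait_for_rw_completion()
            verdict = check(snapshot)
        if verdict is Verdict.SAFE:
            return snapshot
        # UNSAFE: fall through, get a new snapshot, and start over.
```

Note that a single long-running READ WRITE transaction keeps the inner loop in the PENDING state, which matches the blocking behavior described above.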

The idea is that whenever we see a valid snapshot which would yield
a truly serializable view of the data for a READ ONLY transaction,
we add a WAL record with that snapshot information. Of course, we
might want some limit on how often they are sent, to avoid WAL
bloat. A hot standby could just keep the most recently received of
these and use it when a SERIALIZABLE transaction is requested.
Perhaps DEFERRABLE in this context could mean that it waits for the
*next* one and uses it, to assure "freshness".
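
A minimal sketch of the standby side of this idea, under stated assumptions: `StandbySafeSnapshots` and its methods are hypothetical names, the snapshot is an opaque value, and making a non-DEFERRABLE request wait when no safe snapshot has arrived yet is a guess, since the proposal does not cover that case.

```python
import threading


class StandbySafeSnapshots:
    """Keeps the most recently received safe snapshot (hypothetical helper)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._latest = None
        self._received = 0  # number of safe-snapshot WAL records replayed

    def on_safe_snapshot_record(self, snapshot):
        """Called when replay reaches a WAL record carrying a safe snapshot."""
        with self._cond:
            self._latest = snapshot
            self._received += 1
            self._cond.notify_all()

    def begin_serializable(self, deferrable=False, timeout=None):
        """Return the snapshot a SERIALIZABLE transaction would use.

        deferrable=True waits for the *next* record, to assure freshness.
        Otherwise the most recent one is used (assumption: wait for the
        first record if none has been received yet).
        """
        with self._cond:
            target = self._received + 1 if deferrable else max(self._received, 1)
            if not self._cond.wait_for(lambda: self._received >= target, timeout):
                raise TimeoutError("no safe snapshot received in time")
            return self._latest
```

The limit on how often the primary emits these records is not modeled here; it would only change how long a DEFERRABLE request waits.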

Actually, we could try to get tricky to avoid sending a complete
snapshot by having two WAL messages with no payload -- one would
mean "the snapshot you would get now is being tested for
serializability". If it failed to reach that state we would send
another when we started working on a new snapshot. The other type of
message would mean "the snapshot you built when we last told you we
were starting to test one is good." I *think* that can work, and it
may require less WAL space.
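
To make the two-message variant concrete, here is a hedged sketch of how a standby might react to the two payload-free messages. All names (`Msg`, `StandbyReplay`, `build_snapshot`) are hypothetical, and ignoring a "good" message that arrives with no pending candidate is an assumption rather than part of the proposal.

```python
from enum import Enum


class Msg(Enum):
    TESTING = 1            # "the snapshot you would get now is being tested"
    CANDIDATE_IS_GOOD = 2  # "the snapshot you built at the last TESTING is good"


class StandbyReplay:
    """Tracks a candidate snapshot and the last one confirmed safe."""

    def __init__(self, build_snapshot):
        # build_snapshot: callable returning a snapshot of the current
        # replay state (assumed to exist on the standby).
        self._build_snapshot = build_snapshot
        self.candidate = None
        self.safe = None

    def replay(self, msg):
        if msg is Msg.TESTING:
            # A new TESTING message supersedes any earlier candidate,
            # which covers the case where the previous test failed.
            self.candidate = self._build_snapshot()
        elif msg is Msg.CANDIDATE_IS_GOOD:
            if self.candidate is not None:
                self.safe = self.candidate
                self.candidate = None
```

Neither message carries a snapshot; the standby rebuilds the candidate itself, which is where the possible WAL savings would come from.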

If we don't do something like this, do we just provide REPEATABLE
READ on the standby as the strictest level of transaction isolation?
If so, do we generate an error on a request for SERIALIZABLE, warn
and provide degraded behavior, or just quietly give them REPEATABLE
READ behavior?

Thoughts?

-Kevin
