Re: Transaction Snapshots and Hot Standby

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshots and Hot Standby
Date: 2008-09-13 09:48:06
Message-ID: 48CB8C56.8080700@phlo.org
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> BTW, we haven't talked about how to acquire a snapshot in the slave.
> You'll somehow need to know which transactions have not yet
> committed, but will in the future. In the master, we keep track of
> in-progress transaction in the ProcArray, so I suppose we'll need to
> do the same in the slave. Very similar to prepared transactions,
> actually. I believe the Abort records, which are not actually needed
> for normal operation, become critical here. The slave will need to
> put an entry to ProcArray for any new XLogRecord.xl_xid it sees in
> the WAL, and remove the entry at a Commit and Abort record. And clear
> them all at a shutdown record.

For reference, here is how I solved the snapshot problem in my
Summer-of-Code project last year, which dealt with exactly this:
executing read-only queries on PITR slaves. (Sadly, it never got past
the alpha stage due to both my and Simon's lack of time.)

The main idea was to invert the meaning of the xid array in the snapshot
struct - instead of storing all the xids between xmin and xmax that are
to be considered "in-progress", the array contained all the xids >
xmin that are to be considered "completed".
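
A minimal sketch of what a lookup against such an inverted snapshot
could look like. The type and function names (Xid, InvertedSnapshot,
xid_is_completed) are made up for illustration and are not taken from
the project's code; it also assumes plain integer comparison, i.e. it
ignores xid wraparound, and treats every xid below xmin as completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical stand-in for TransactionId */

/* Hypothetical inverted snapshot: xids[] lists COMPLETED xids, not running ones. */
typedef struct InvertedSnapshot
{
    Xid         xmin;           /* assumption: every xid < xmin counts as completed */
    size_t      nxids;          /* number of entries in xids[] */
    const Xid  *xids;           /* xids at or above xmin known to be completed */
} InvertedSnapshot;

/*
 * Returns true if xid is considered completed under the snapshot.
 * Anything at or above xmin that is NOT listed is treated as in-progress,
 * which is the inverse of the usual "listed means in-progress" rule.
 * Simplification: uses plain < instead of a wraparound-aware comparison.
 */
bool
xid_is_completed(const InvertedSnapshot *snap, Xid xid)
{
    assert(snap != NULL);

    if (xid < snap->xmin)
        return true;

    /* Linear scan; a real implementation might keep the array sorted. */
    for (size_t i = 0; i < snap->nxids; i++)
    {
        if (snap->xids[i] == xid)
            return true;
    }
    return false;
}
```

Note that an xid the slave has never heard of falls out as "in-progress"
without any extra bookkeeping, since it is simply absent from the array.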

The current read-only snapshot (with "current" meaning the
corresponding state on the master at the time the last replayed WAL
record was generated) was maintained in shared memory. Its xmin field
was continually updated from the (newly added) XLogRecord.xl_xmin
field, which contained the xid of the oldest running query on the
master, with a pruning step after each ReadOnlySnapshot.xmin update to
remove all entries < xmin from the xid array. If a commit was seen for
an xid, that xid was added to the ReadOnlySnapshot.xid array.
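
The maintenance steps described above might be sketched roughly as
follows. Only the names ReadOnlySnapshot, xmin, xid and xl_xmin come
from the description; the function names, the fixed capacity
RO_SNAPSHOT_MAX_XIDS and the -1 "full" return value are assumptions
made for this example, and xid wraparound and locking are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                   /* hypothetical stand-in for TransactionId */

#define RO_SNAPSHOT_MAX_XIDS 1024       /* hypothetical fixed capacity */

typedef struct ReadOnlySnapshot
{
    Xid     xmin;                       /* oldest running xid on the master */
    size_t  nxids;                      /* used entries in xid[] */
    Xid     xid[RO_SNAPSHOT_MAX_XIDS];  /* completed xids at or above xmin */
} ReadOnlySnapshot;

/*
 * Apply the xl_xmin carried by a replayed WAL record, then prune all
 * entries below the new xmin, compacting the array in place.
 */
void
ro_snapshot_advance_xmin(ReadOnlySnapshot *snap, Xid xl_xmin)
{
    size_t  kept = 0;

    assert(snap != NULL);

    if (xl_xmin <= snap->xmin)
        return;                         /* assumption: xmin only moves forward */

    snap->xmin = xl_xmin;
    for (size_t i = 0; i < snap->nxids; i++)
    {
        if (snap->xid[i] >= snap->xmin)
            snap->xid[kept++] = snap->xid[i];
    }
    snap->nxids = kept;
}

/*
 * Record a commit seen during replay.
 * Returns 0 on success, -1 if the fixed-size array is full.
 */
int
ro_snapshot_record_commit(ReadOnlySnapshot *snap, Xid xid)
{
    assert(snap != NULL);

    if (xid < snap->xmin)
        return 0;                       /* already implied by xmin */
    if (snap->nxids >= RO_SNAPSHOT_MAX_XIDS)
        return -1;

    snap->xid[snap->nxids++] = xid;
    return 0;
}
```

The array only shrinks when xmin advances, so a single long-running
transaction on the master keeps every later commit in the array until
it finishes.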

The advantage of this concept is that it handles snapshotting on the
slave without too much additional work for the master (the only change
is the addition of the xl_xmin field to XLogRecord). In particular, it
removes the need to track ShmemVariableCache->nextXid.

The downside is that the size of the read-only snapshot is theoretically
unbounded, which poses a bit of a problem if it's supposed to live
inside shared memory...

regards, Florian Pflug
