Synchronous replication, network protocol

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)enterprisedb(dot)com>
Subject: Synchronous replication, network protocol
Date: 2008-12-23 16:23:38
Message-ID: 4951108A.5040608@enterprisedb.com
Lists: pgsql-hackers

The protocol between primary and standby hasn't been discussed or
documented in detail.

I don't think it's enough to just stream WAL as it's generated, so
here's my proposal. Messages marked with "(later)" are for features that
have been discussed, but that no one is implementing for 8.4. The messages
are sent like in the frontend/backend protocol. The handshake can work
like in the current patch, although I don't think we need or should
allow running regular queries before entering "replication mode". The
backend should become a walsender process directly after authentication.
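
To make "sent like in the frontend/backend protocol" concrete, here is a
minimal sketch of that framing: one message-type byte, then a 4-byte
length in network byte order that counts itself and the payload but not
the type byte. The function names and the type byte used in the usage
note are hypothetical; the proposal does not assign type bytes or
payload layouts.

```python
import struct


def encode_message(msg_type: bytes, payload: bytes) -> bytes:
    """Frame one message: type byte + int32 length + payload.

    The length covers itself (4 bytes) plus the payload, not the type byte.
    """
    if len(msg_type) != 1:
        raise ValueError("message type must be exactly one byte")
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


def decode_message(buf: bytes):
    """Return (msg_type, payload, remaining), or None if buf is incomplete."""
    if len(buf) < 5:
        return None  # type byte and length not fully received yet
    (length,) = struct.unpack("!I", buf[1:5])
    if length < 4:
        raise ValueError("invalid message length")
    end = 1 + length
    if len(buf) < end:
        return None  # payload not fully received yet
    return buf[0:1], buf[5:end], buf[end:]
```

For example, encode_message(b"R", b"abc") yields the type byte, a length
of 7, and the three payload bytes; decode_message returns None until a
whole message has arrived, so it can be called on a growing receive
buffer.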

Standby -> primary

RequestWAL <begin> <end>
Primary should respond with a WALRange message containing the given
range of WAL data.

StartReplication <begin>
Primary should send all already-generated WAL beginning from <begin>,
and then keep sending as it's generated.

ReplicatedUpTo <end>
Acknowledge that all WAL up to <end> has been successfully received and
written to disk and/or fsync'd (depending on the replication mode in
use). The primary can use this information to acknowledge a transaction
as committed to the client in case of synchronous replication.
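
A rough sketch of how the primary might use ReplicatedUpTo in synchronous
mode: commits wait until the standby has acknowledged WAL up to the end
of their commit record. The class and method names are hypothetical, and
WAL positions are simplified to plain integers.

```python
import heapq


class SyncCommitTracker:
    """Tracks which commits are waiting for the standby's acknowledgement."""

    def __init__(self):
        self.replicated_up_to = 0  # highest position acknowledged so far
        self._waiting = []         # min-heap of (commit_end, client)

    def register_commit(self, commit_end, client):
        """Return True if the commit is already replicated, else queue it."""
        if commit_end <= self.replicated_up_to:
            return True
        heapq.heappush(self._waiting, (commit_end, client))
        return False

    def on_replicated_up_to(self, end):
        """Handle a ReplicatedUpTo message; return the clients to acknowledge."""
        # Ignore stale acknowledgements that would move the position backwards.
        self.replicated_up_to = max(self.replicated_up_to, end)
        released = []
        while self._waiting and self._waiting[0][0] <= self.replicated_up_to:
            released.append(heapq.heappop(self._waiting)[1])
        return released
```

The heap keeps waiters ordered by commit position, so a single
acknowledgement can release every commit at or below <end> in one pass.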

(later) OldestXmin <xid>
When a hot standby server is running read-only queries, indicates the
current OldestXmin on the standby. The primary can refrain from
vacuuming tuples still required by the standby using this value, if so
configured. That will ensure that the standby doesn't need to stall WAL
application because of read-only queries.
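
As an illustration only, the primary could fold the standby's reported
OldestXmin into its own vacuum horizon by taking the older of the two
values. The function names and the honor_standby switch (standing in for
"if so configured") are hypothetical. The comparison is modulo 2^32 to
cope with XID wraparound and ignores the special reserved XIDs.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if 32-bit xid a is logically older than xid b (circular compare)."""
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000  # negative when read as a signed 32-bit value


def vacuum_horizon(local_oldest_xmin, standby_oldest_xmin, honor_standby=True):
    """Pick the xid below which the primary may remove dead tuples."""
    if not honor_standby or standby_oldest_xmin is None:
        return local_oldest_xmin
    if xid_precedes(standby_oldest_xmin, local_oldest_xmin):
        return standby_oldest_xmin  # standby still needs older tuples
    return local_oldest_xmin
```

With honor_standby disabled the primary vacuums as it does today, and the
standby would have to resolve conflicts with its read-only queries itself.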

(later) RequestBaseBackup
Request a new base backup to be sent. This can be used to initialize a
new standby.

Primary -> standby

WALRange <begin> <end> <data>
Response to a RequestWAL or StartReplication message. After receiving a
StartReplication message, the primary can send these messages when it
feels like it. In synchronous mode, that would be at least at each
commit. The standby should respond with a ReplicatedUpTo message to each
WALRange message.
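
The standby side of that exchange might look roughly like the sketch
below: each WALRange is appended to local storage and answered with a
ReplicatedUpTo carrying the new position. The class name, the integer
WAL positions, the in-memory buffer standing in for WAL files, and the
handling of overlapping or out-of-order ranges are all assumptions, not
part of the proposal.

```python
import io


class Standby:
    """Receives WALRange messages and produces ReplicatedUpTo replies."""

    def __init__(self, start):
        self.wal = io.BytesIO()      # stands in for the standby's WAL files
        self.received_up_to = start  # next WAL position we expect

    def handle_wal_range(self, begin, end, data):
        if len(data) != end - begin:
            raise ValueError("data length does not match <begin>..<end>")
        if begin > self.received_up_to:
            # A gap in the stream; this sketch simply rejects it.
            raise ValueError("WALRange starts beyond the received position")
        # Skip any bytes we already have, then store the rest.
        skip = self.received_up_to - begin
        if skip < len(data):
            self.wal.write(data[skip:])
            # A real standby would write to disk and/or fsync here,
            # depending on the replication mode.
            self.received_up_to = end
        return ("ReplicatedUpTo", self.received_up_to)
```

Replying with the current position even for a fully duplicate range keeps
the rule of one ReplicatedUpTo per WALRange.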

(later) BaseBackup <data>
A base backup, sent in response to a RequestBaseBackup message, for
example in .tar.gz format.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
