Re: Synchronous replication, network protocol

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)enterprisedb(dot)com>
Subject: Re: Synchronous replication, network protocol
Date: 2008-12-23 17:53:55
Message-ID: 1230054835.4793.939.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Tue, 2008-12-23 at 18:23 +0200, Heikki Linnakangas wrote:

> I don't think we need or should
> allow running regular queries before entering "replication mode". The
> backend should become a walsender process directly after authentication.

+1

> Standby -> primary
>
> RequestWAL <begin> <end>
> Primary should respond with a WALRange message containing the given
> range of WAL data.
>
> StartReplication <begin>
> Primary should send all already-generated WAL beginning from <begin>,
> and then keep sending as it's generated.

Can you give a quick example of how these would be used?

Fujii-san and others considered that having replication start early was
an important requirement. If we do these operations serially on the same
connection
* copy all bulk data
* start streaming
then there is a considerable delay before replication can begin. In the
case of some large sites, perhaps as long as 18-24 hrs.

> ReplicatedUpTo <end>
> Acknowledge that all WAL up to <end> has been successfully received and
> written to disk and/or fsync'd (depending on the replication mode in
> use). The primary can use this information to acknowledge a transaction
> as committed to the client in case of synchronous replication.

+1
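
To illustrate the acknowledgement step, here is a minimal sketch of how a primary might hold back a commit until the standby has reported a high enough position. It is not the proposed implementation: the class name SyncRepWaiter, its method names, and the use of plain integers for WAL positions are all assumptions made for illustration.

```python
import threading


class SyncRepWaiter:
    """Hypothetical helper: tracks the highest position the standby has
    acknowledged via ReplicatedUpTo and lets committing backends wait on it."""

    def __init__(self):
        self._cond = threading.Condition()
        self._replicated_up_to = 0  # assumption: WAL positions are plain integers

    def on_replicated_up_to(self, end):
        """Called when a ReplicatedUpTo <end> message arrives from the standby."""
        with self._cond:
            if end > self._replicated_up_to:
                self._replicated_up_to = end
                self._cond.notify_all()

    def wait_for_commit(self, commit_pos, timeout=None):
        """Block until the standby has acknowledged WAL up to commit_pos.

        Returns True once acknowledged, False if the timeout expires first.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: self._replicated_up_to >= commit_pos, timeout
            )
```

A committing backend would call wait_for_commit() with the position of its commit record and only then report success to the client.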

> Primary -> standby
>
> WALRange <begin> <end> <data>
> Response to RequestWAL or StartReplication message. After receiving a
> StartReplication message, the primary can send these messages when it
> feels like it. In synchronous mode, that would be at least at each
> commit. The standby should respond with a ReplicatedUpTo message to each
> WALRange message.

+1
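
The WALRange / ReplicatedUpTo exchange could be sketched roughly as below. The wire layout is an assumption, since the thread does not define one: a one-byte type tag ("W" and "R" are made up), 64-bit big-endian positions, and an outer framing layer that already delimits each message. The function names are hypothetical too.

```python
import struct

# Hypothetical type tags; the proposal does not assign any.
MSG_WAL_RANGE = b"W"
MSG_REPLICATED_UP_TO = b"R"


def encode_wal_range(begin, end, data):
    """Build a WALRange <begin> <end> <data> message."""
    if end - begin != len(data):
        raise ValueError("data length must equal end - begin")
    return MSG_WAL_RANGE + struct.pack("!QQ", begin, end) + data


def decode_wal_range(msg):
    """Split a WALRange message into (begin, end, data)."""
    if msg[:1] != MSG_WAL_RANGE:
        raise ValueError("not a WALRange message")
    begin, end = struct.unpack("!QQ", msg[1:17])
    data = msg[17:]
    if end - begin != len(data):
        raise ValueError("truncated or oversized WALRange payload")
    return begin, end, data


def encode_replicated_up_to(end):
    """Build a ReplicatedUpTo <end> message."""
    return MSG_REPLICATED_UP_TO + struct.pack("!Q", end)


def standby_handle(msg, persist):
    """Standby side: store the received WAL, then return the acknowledgement.

    persist is a caller-supplied callable that writes (and, depending on the
    replication mode, fsyncs) the data before this function returns.
    """
    _begin, end, data = decode_wal_range(msg)
    persist(data)
    return encode_replicated_up_to(end)
```

The standby answers every WALRange with one ReplicatedUpTo, as described in the quoted text; batching acknowledgements would be a separate design choice.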

> (later) RequestBaseBackup
> Request a new base backup to be sent. This can be used to initialize a
> new slave.

> (later) BaseBackup <data>
> A base backup, in response to RequestBaseBackup message. For example,
> in .tar.gz format.

Experience from Slony shows that single-threading the initial data send
is not a great idea for large databases, since it limits the bandwidth
even if you have more available. (Slony has no choice because of the
current single-transaction => single-thread requirement.) Being able to
take a base backup in parallel is an important feature with large
databases. I think we need to offer an option here rather than force use
of a single thread, though I agree that a single thread may be the more
convenient option for many people.
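
As a rough illustration of the parallel option, the sketch below copies a directory tree with several worker threads instead of one. It only shows the fan-out idea: the function name copy_tree_parallel and the worker count are assumptions, it copies between local paths rather than over the replication connection, and it leaves out the backup start/stop bracketing that a real base backup needs.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_tree_parallel(src, dst, workers=4):
    """Copy every file under src to dst using a pool of worker threads.

    Returns the number of files copied.
    """
    jobs = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        # Create directories up front so workers only deal with file copies.
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            jobs.append((os.path.join(root, name), os.path.join(target_dir, name)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the results so any copy error is raised here.
        list(pool.map(lambda job: shutil.copy2(job[0], job[1]), jobs))
    return len(jobs)
```

With workers=1 this degenerates to the single-threaded case, so the same code path could serve both options.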

Rumour has it that Slony might move towards a synchronisation that used
a base backup and PITR as its starting point.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
