Re: Synchronous replication, network protocol

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Pavan Deolasee" <pavan(dot)deolasee(at)enterprisedb(dot)com>
Subject: Re: Synchronous replication, network protocol
Date: 2008-12-23 18:42:48
Message-ID: 3f0b79eb0812231042n14b57a66p7b8d7cb726c5d3c0@mail.gmail.com
Lists: pgsql-hackers

Hi,

Thanks for clarifying!

On Wed, Dec 24, 2008 at 2:53 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Tue, 2008-12-23 at 18:23 +0200, Heikki Linnakangas wrote:
>
>> I don't think we need or should
>> allow running regular queries before entering "replication mode". The
>> backend should become a walsender process directly after authentication.
>
> +1

OK, I will re-examine it. But at the very least, we need to send a
ReadyForQuery message after authentication and before sending WAL,
because walreceiver uses libpq (PQsetdbLogin), which doesn't return
until it receives ReadyForQuery.
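
To illustrate the point, here is a minimal sketch of a client that, like
the connection routine described above, keeps consuming backend messages
until it sees ReadyForQuery. It is a simplified model and not libpq's
actual code. The framing (one type byte, then a 4-byte big-endian length
that counts itself) and the type bytes 'R' (authentication) and 'Z'
(ReadyForQuery) follow the regular frontend/backend protocol; the helper
names and the WAL payload are hypothetical.

```python
import struct


def backend_msg(msg_type: bytes, payload: bytes) -> bytes:
    """Frame a backend message: type byte + int32 length (includes itself) + payload."""
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


def read_until_ready(buf: bytes):
    """Consume messages until ReadyForQuery ('Z').

    Returns (list of message types seen, leftover bytes after 'Z').
    Raises ConnectionError if the stream ends without a ReadyForQuery.
    """
    seen = []
    pos = 0
    while pos + 5 <= len(buf):
        msg_type = buf[pos:pos + 1]
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        end = pos + 1 + length
        if end > len(buf):
            break  # incomplete message: a real client would keep waiting
        seen.append(msg_type.decode("ascii"))
        pos = end
        if msg_type == b"Z":
            return seen, buf[pos:]
    raise ConnectionError("stream ended before ReadyForQuery")


auth_ok = backend_msg(b"R", struct.pack("!I", 0))  # AuthenticationOk
ready = backend_msg(b"Z", b"I")                    # ReadyForQuery, idle
wal = b"hypothetical WAL bytes"                    # placeholder payload

# With ReadyForQuery in the stream, the connect step completes and the
# remaining bytes are left for the replication code to interpret.
seen, rest = read_until_ready(auth_ok + ready + wal)
```

If the primary skipped ReadyForQuery and went straight to WAL, a client
of this shape would never consider the connection established.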

>
>> Standby -> primary
>>
>> RequestWAL <begin> <end>
>> Primary should respond with a WALRange message containing the given
>> range of WAL data.
>>
>> StartReplication <begin>
>> Primary should send all already-generated WAL beginning from <begin>,
>> and then keep sending as it's generated.
>
> Can you give a quick example of how these would be used?
>
> Fujii-san and others considered that having replication start early was
> an important requirement. If we do these operations serially on the same
> connection
> * copy all bulk data
> * start streaming
> then there is a considerable delay before replication can begin. In the
> case of some large sites, perhaps as long as 18-24 hrs.

Agreed. In a very busy system, if those operations are performed serially,
we might not be able to start streaming at all. I mean that WAL might be
generated faster than it can be copied.
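
The arithmetic behind this concern can be sketched as follows. The
function name and the idea of constant rates are assumptions made for
illustration; real WAL generation is bursty.

```python
def catch_up_seconds(backlog_bytes, copy_rate, wal_rate):
    """Time for a serial copy to drain the WAL backlog.

    backlog_bytes: WAL already generated but not yet copied
    copy_rate:     bytes/second the standby can copy
    wal_rate:      bytes/second of new WAL generated on the primary

    Returns the number of seconds until the backlog is empty, or None
    if the copy never catches up (WAL is generated at least as fast as
    it is copied).
    """
    if backlog_bytes <= 0:
        return 0.0
    net_rate = copy_rate - wal_rate
    if net_rate <= 0:
        return None
    return backlog_bytes / net_rate
```

For example, a 1000-byte backlog copied at 10 bytes/s while 5 bytes/s of
new WAL arrives takes 200 seconds to drain; with the two rates swapped,
streaming would never start.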

>
>> ReplicatedUpTo <end>
>> Acknowledge that all WAL up to <end> has been successfully received and
>> written to disk and/or fsync'd (depending on the replication mode in
>> use). The primary can use this information to acknowledge a transaction
>> as committed to the client in case of synchronous replication.
>
> +1

Yes.
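
As a rough sketch of how the primary could use ReplicatedUpTo in
synchronous mode: each commit waits until the standby has acknowledged
WAL up to the end of its commit record. The class and method names are
hypothetical, and WAL positions are modelled as plain integers.

```python
class SyncCommitTracker:
    """Tracks which commits may be acknowledged to their clients."""

    def __init__(self):
        self.replicated_up_to = 0   # highest <end> acknowledged so far
        self.waiting = []           # (commit_end, txn_id) pairs

    def commit(self, txn_id, commit_end):
        """Register a transaction whose commit record ends at commit_end."""
        self.waiting.append((commit_end, txn_id))

    def on_replicated_up_to(self, end):
        """Handle a ReplicatedUpTo <end> message from the standby.

        Returns the transactions that can now be reported as committed.
        """
        # Acknowledgements never move the position backwards.
        self.replicated_up_to = max(self.replicated_up_to, end)
        done = [t for e, t in self.waiting if e <= self.replicated_up_to]
        self.waiting = [(e, t) for e, t in self.waiting
                        if e > self.replicated_up_to]
        return done
```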

>
>> Primary -> standby
>>
>> WALRange <begin> <end> <data>
>> Response to RequestWAL or StartReplication message. After receiving a
>> StartReplication message, the primary can send these messages when it
>> feels like it. In synchronous mode, that would be at least at each
>> commit. The standby should respond with a ReplicatedUpTo message to each
>> WALRange message.
>
> +1

Currently, <begin> is not sent because it can be calculated from <end> and
the data length. This reduces network traffic to some degree.
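
A minimal sketch of that calculation, assuming a WAL position can be
treated as a single 64-bit byte offset and that a message carries only
<end> followed by the data. The wire layout and function names here are
hypothetical, not the patch's actual format.

```python
import struct


def encode_wal_range(end, data):
    """Build a message containing only <end> (8 bytes) and the WAL data."""
    return struct.pack("!Q", end) + data


def decode_wal_range(msg):
    """Recover (begin, end, data) on the standby side."""
    (end,) = struct.unpack("!Q", msg[:8])
    data = msg[8:]
    begin = end - len(data)   # <begin> is implied, so it is not transmitted
    return begin, end, data


msg = encode_wal_range(1000, b"x" * 100)
begin, end, data = decode_wal_range(msg)
```

Under these assumptions the saving is one position field (8 bytes here)
per WALRange message.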

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
