Re: Sync Rep: First Thoughts on Code

From: "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
To: aidan(at)highrise(dot)ca
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep: First Thoughts on Code
Date: 2008-12-12 03:53:44
Message-ID: 3f0b79eb0812111953o1c5d6a37j9188dd059763b8d7@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Fri, Dec 12, 2008 at 12:15 AM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> * Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> [081211 10:09]:
>> Simon Riggs wrote:
>>> On Thu, 2008-12-11 at 09:27 -0500, Aidan Van Dyk wrote:
>>>
>>>> But "catchup" *has* to be *done* before PostgreSQL can enter "sync rep".
>>>
>>> Not true. Please reread the thread where Heikki questions that and I
>>> reply. This was Fujii-san's idea, which I now agree with.
>>
>> I think the confusion here is about what exactly "sync rep" means in
>> this situation. It's true that you can start streaming the WAL before
>> the standby has fully caught up. But from the client's point of view,
>> there's not much point in streaming the log *synchronously* and making
>> the client wait for the acknowledgment from the standby, if the
>> acknowledgment from the standby that WAL has been streamed up to point X
>> doesn't actually guarantee that the slave can recover all the way to
>> that point.
>
> Quite possibly a terminology problem.. In my case I said "sync rep"
> meaning the mode such that the transaction doesn't commit successfully
> for my PG client until the xlog record has been "streamed" to the
> client... and I understand that at his presentation at PGcon, Fujii-san
> said there could be possible variants on when the "streamed" is considered
> done based on network, slave ram, disk, application, etc.

I'd like to define the meaning of "synch rep" again. "synch rep" means:

(1) Transaction commit waits for WAL records to be replicated to the standby
before the command returns a "success" indication to the client.

(2) The standby has (can read) all WAL files indispensable for recovery.

If both are true, your system is in "synch rep"; you can perform failover safely
without any transaction loss whenever the primary goes down. On the other
hand, if either is false, your system is not in "synch rep" but in "standalone";
a failure of the primary might cause some transaction loss. Starting the
standby does not by itself mean "synch rep". We have to wait for (1) *and* (2)
after starting the standby. (1) is reported as a server log message, so we can
wait for (1). (2) is somewhat more complicated: if an archive is shared, the
server log message for archiving indicates (2). Otherwise, completion of the
copy operation (copying the indispensable WAL files from the primary to the
standby) by the user or clusterware indicates (2). But, as Simon pointed out,
since many people share an archive, they should monitor only the server log
messages. Or, should I create a feature that lets the user confirm via SQL
whether it's in "synch rep"?
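
To make the definition concrete, here is a minimal sketch in Python. It is not
code from the patch; the names StandbyStatus, replication_mode and
failover_is_lossless are made up for this illustration, and how the two flags
would actually be obtained (server log messages, an SQL function, etc.) is left
open.

```python
from dataclasses import dataclass


@dataclass
class StandbyStatus:
    """Hypothetical snapshot of the two conditions described above."""
    commit_waits_for_replication: bool  # condition (1)
    standby_has_all_wal: bool           # condition (2)


def replication_mode(status: StandbyStatus) -> str:
    """Return "synch rep" only when both (1) and (2) hold."""
    if status.commit_waits_for_replication and status.standby_has_all_wal:
        return "synch rep"
    # If either condition is false, the system counts as "standalone".
    return "standalone"


def failover_is_lossless(status: StandbyStatus) -> bool:
    """Failover is safe from transaction loss only in "synch rep"."""
    return replication_mode(status) == "synch rep"
```

For example, a standby that has just started streaming but has not yet received
all the WAL files it needs would be StandbyStatus(True, False), which this
sketch classifies as "standalone".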

Since there is a short delay between (1) and (2), we could do WAL streaming
asynchronously during that delay only, as Heikki pointed out. But I'm not sure
it's worth trying.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
